[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108800 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -39,13 +39,16 @@ private[spark] object PythonEvalType {

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108771 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108782 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -171,6 +171,7 @@ trait CheckAnalysis extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108798 --- Diff: python/pyspark/sql/tests.py --- @@ -4052,6 +4066,323 @@ def test_unsupported_types(self):

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108773 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -273,7 +274,7 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108765 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -92,8 +99,14 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r159108762 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -215,3 +228,49 @@ object ExtractPythonUDFs

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158952743 --- Diff: python/pyspark/sql/tests.py --- @@ -477,6 +502,7 @@ def test_udf_with_aggregate_function(self): sel =

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158952610 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -171,6 +171,7 @@ trait CheckAnalysis extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158952133 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -273,7 +274,7 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158951752 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158901221 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158912955 --- Diff: python/pyspark/sql/tests.py --- @@ -4052,6 +4066,323 @@ def test_unsupported_types(self): df.groupby('id').apply(f).collect()

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158913244 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -215,3 +228,49 @@ object ExtractPythonUDFs extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158907883 --- Diff: python/pyspark/sql/tests.py --- @@ -477,6 +502,7 @@ def test_udf_with_aggregate_function(self): sel =

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158912156 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -92,8 +99,14 @@ object ExtractPythonUDFFromAggregate

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158909734 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -171,6 +171,7 @@ trait CheckAnalysis extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158911077 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -273,7 +274,7 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158902463 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -39,13 +39,16 @@ private[spark] object PythonEvalType { val

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-27 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158872851 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-27 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158872825 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-27 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158872704 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-27 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158872655 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158423362 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-21 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158302645 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-21 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158301996 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157948426 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157939292 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157944969 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157944622 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157938453 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157931925 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -15,10 +15,9 @@ * limitations under the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157931824 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157896109 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -113,6 +113,7 @@ object ExtractPythonUDFs

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157896062 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157895967 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157895765 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891787 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,9 +48,26 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891697 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,9 +48,26 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891668 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -32,7 +31,5 @@ case class PythonUDF(

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891477 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -437,6 +437,37 @@ class RelationalGroupedDataset

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891450 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -437,6 +437,37 @@ class RelationalGroupedDataset

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891391 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -38,3 +38,13 @@ case class

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891261 --- Diff: python/pyspark/sql/tests.py --- @@ -4016,6 +4016,89 @@ def test_unsupported_types(self): with

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891322 --- Diff: python/pyspark/sql/udf.py --- @@ -56,6 +56,10 @@ def _create_udf(f, returnType, evalType): return udf_obj._wrapped()

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157891206 --- Diff: python/pyspark/sql/functions.py --- @@ -2070,6 +2070,8 @@ class PandasUDFType(object): GROUP_MAP =

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157825297 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157823488 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -32,7 +31,5 @@ case class PythonUDF(

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157823295 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -15,10 +15,9 @@ * limitations under the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156037616 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,9 +48,26 @@ object ExtractPythonUDFFromAggregate

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156028776 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -32,7 +31,5 @@ case class PythonUDF(

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156034167 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156037157 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,9 +48,26 @@ object ExtractPythonUDFFromAggregate

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156031375 --- Diff: python/pyspark/sql/tests.py --- @@ -4016,6 +4016,124 @@ def test_unsupported_types(self): with self.assertRaisesRegexp(Exception,

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r156038036 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -15,10 +15,9 @@ * limitations under the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-05 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r155159185 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154809806 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154782452 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154644084 --- Diff: python/pyspark/sql/udf.py --- @@ -56,6 +56,10 @@ def _create_udf(f, returnType, evalType): return udf_obj._wrapped()

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154642230 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154642902 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -38,3 +38,13 @@ case class

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154644620 --- Diff: python/pyspark/sql/tests.py --- @@ -4016,6 +4016,89 @@ def test_unsupported_types(self): with

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154644235 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -437,6 +437,37 @@ class RelationalGroupedDataset

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154644340 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -437,6 +437,37 @@ class RelationalGroupedDataset

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154616454 --- Diff: python/pyspark/sql/functions.py --- @@ -2070,6 +2070,8 @@ class PandasUDFType(object): GROUP_MAP =

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-04 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154615728 --- Diff: python/pyspark/sql/udf.py --- @@ -56,6 +56,10 @@ def _create_udf(f, returnType, evalType): return udf_obj._wrapped()

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154569899 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154569953 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154569884 --- Diff: python/pyspark/sql/group.py --- @@ -89,8 +89,15 @@ def agg(self, *exprs): else: # Columns assert

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r154569177 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -113,6 +113,7 @@ object ExtractPythonUDFs extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-03 Thread icexelloss
GitHub user icexelloss opened a pull request: https://github.com/apache/spark/pull/19872 WIP: [SPARK-22274][PySpark] User-defined aggregation functions with pandas udf ## What changes were proposed in this pull request? Add support for pandas_udf in groupby().agg()