[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21427

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197527013 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -120,4 +121,19 @@ object ArrowUtils {

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197525704 --- Diff: python/pyspark/worker.py --- @@ -110,9 +116,20 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197524629 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,18 +58,18 @@ class ArrowPythonRunner(

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197510171 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,18 +58,18 @@ class ArrowPythonRunner(

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197509839 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -63,7 +64,7 @@ case class

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197509567 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -120,4 +121,19 @@ object ArrowUtils {

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r197508262 --- Diff: python/pyspark/worker.py --- @@ -110,9 +116,20 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196457433 --- Diff: python/pyspark/worker.py --- @@ -110,9 +116,20 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196442152 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -120,4 +121,19 @@ object ArrowUtils {

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196438132 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1161,6 +1161,16 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196437909 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -97,7 +98,7 @@ case class WindowInPandasExec(

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196437623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -77,7 +78,7 @@ case class

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196437348 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,18 +58,18 @@ class ArrowPythonRunner(

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196436658 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -63,7 +64,7 @@ case class

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196435526 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -120,4 +121,19 @@ object ArrowUtils {

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-18 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196243235 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1161,6 +1161,16 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-18 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196242012 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,18 +58,18 @@ class ArrowPythonRunner(

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-06-18 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r196241595 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1161,6 +1161,16 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191596459 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the returned

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-29 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191511343 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-29 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191503646 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-29 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191502476 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-29 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191502180 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,63 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf)

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-27 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191076477 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191070228 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191040210 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,63 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf)

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191037717 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the returned

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191016970 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191015609 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191015105 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,63 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf)

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191006873 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-25 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r191004141 --- Diff: python/pyspark/worker.py --- @@ -111,9 +114,16 @@ def wrapped(key_series, value_series): "Number of columns of the

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21427#discussion_r190793613 --- Diff: python/pyspark/sql/tests.py --- @@ -4931,6 +4931,33 @@ def foo3(key, pdf): expected4 = udf3.func((), pdf)

[GitHub] spark pull request #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF shou...

2018-05-24 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/21427 [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assign result columns by name ## What changes were proposed in this pull request? Currently, a `pandas_udf` of type