[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 LGTM for the change, but I'm not sure whether the test was indeed triggered or not. --- - To unsubscribe, e-mail: re

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 I confirmed that the test passes after the patch in my local environment.

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 Thanks! merging to master.

[GitHub] spark issue #20114: [SPARK-22530][PYTHON][SQL] Adding Arrow support for Arra...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20114 How about simply returning `false` from `ArrowVectorAccessor.isNullAt(int rowId)` when `accessor.getValueCount() > 0 && accessor.getValidityBuffer().capacity() == 0` without modifying
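The short-circuit suggested in this comment can be sketched roughly as below. This is only an illustration in Python, not the actual Java accessor: `FakeVector`, `validity_capacity` and `is_null_at` are hypothetical stand-ins, and the fallback bitmap read assumes one validity bit per value in least-significant-bit order.

```python
class FakeVector:
    """Hypothetical stand-in for an Arrow vector: a value count plus a validity bitmap."""

    def __init__(self, value_count, validity=b""):
        self.value_count = value_count
        self.validity = validity  # one bit per value, 1 = non-null; may be empty

    def validity_capacity(self):
        return len(self.validity)


def is_null_at(vector, row_id):
    # Proposed short-circuit: values exist but the validity buffer is empty,
    # so report every row as non-null instead of reading the buffer.
    if vector.value_count > 0 and vector.validity_capacity() == 0:
        return False
    # Otherwise consult the bitmap (least-significant-bit ordering assumed).
    byte = vector.validity[row_id // 8]
    return (byte >> (row_id % 8)) & 1 == 0
```

With an empty validity buffer the check never touches the bitmap, which is the point of the suggestion; any other case falls through to the normal lookup.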

[GitHub] spark issue #20114: [SPARK-22530][PYTHON][SQL] Adding Arrow support for Arra...

2017-12-29 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20114 Jenkins, retest this please.

[GitHub] spark pull request #20115: [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test...

2017-12-29 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/20115 [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test failure when xmlrunner is installed. ## What changes were proposed in this pull request? This is a follow-up pr of #19587. If

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 Jenkins, retest this please.

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 @HyukjinKwon Thanks, I'll take another look soon.

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 LGTM, pending Jenkins.

[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20151 The changes LGTM. Btw, what if the module is missing from the Python path? Can we tell from the exception message that the error is caused by the missing module

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 Thanks! merging to master/2.3.

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Support for deploying Anaconda an...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/14180 @gatorsmile @jiangxb1987 Maybe we should review and merge #13599 first because this pr is based on it.

[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20151 Looks good. Let's wait for @rxin's response.

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160085587 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088373 --- Diff: python/pyspark/context.py --- @@ -1023,6 +1039,33 @@ def getConf(self): conf.setAll(self._conf.getAll()) return conf

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160087986 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088297 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160093389 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160093249 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160087899 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160091411 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160085499 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160083009 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081357 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160090598 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160079462 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160083172 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081109 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081322 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088202 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318424 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318804 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318854 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160319860 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark issue #20163: [SPARK-22966][PySpark] Spark SQL should handle Python UD...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20163 I investigated the behavior differences between `udf` and `pandas_udf` for wrong return types and found that there are actually many differences. Basically `udf`s return `null` as @HyukjinKwon

[GitHub] spark pull request #20163: [SPARK-22966][PySpark] Spark SQL should handle Py...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20163#discussion_r160364055 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala --- @@ -120,10 +121,18 @@ object EvaluatePython

[GitHub] spark issue #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pandas ti...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20213 LGTM.

[GitHub] spark pull request #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pa...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20213#discussion_r160588603 --- Diff: python/pyspark/sql/session.py --- @@ -459,21 +459,23 @@ def _convert_from_pandas(self, pdf, schema, timezone): # TODO

[GitHub] spark issue #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pandas ti...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20213 Thanks! merging to master/2.3.

[GitHub] spark issue #20210: [SPARK-23009][PYTHON] Fix for non-str col names to creat...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20210 LGTM.

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r160605967 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -457,13 +458,26 @@ class RelationalGroupedDataset protected[sql

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r160599447 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -80,27 +84,77 @@ case class

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r160613124 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1255,6 +1255,34 @@ class DataFrameSuite extends QueryTest with

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r160616235 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -15,12 +15,30 @@ * limitations under the

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r160620400 --- Diff: python/pyspark/sql/tests.py --- @@ -511,7 +517,6 @@ def test_udf_with_order_by_and_limit(self): my_copy = udf(lambda x: x, IntegerType

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r160617597 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,152 @@ +/* + * Licensed to the

[GitHub] spark pull request #20217: [SPARK-23026] [PySpark] Add RegisterUDF to PySpar...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20217#discussion_r160872774 --- Diff: python/pyspark/sql/catalog.py --- @@ -255,26 +255,67 @@ def registerFunction(self, name, f, returnType=StringType

[GitHub] spark pull request #20217: [SPARK-23026] [PySpark] Add RegisterUDF to PySpar...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20217#discussion_r160873927 --- Diff: python/pyspark/sql/context.py --- @@ -578,6 +606,9 @@ def __init__(self, sqlContext): def register(self, name, f, returnType=StringType

[GitHub] spark pull request #20217: [SPARK-23026] [PySpark] Add RegisterUDF to PySpar...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20217#discussion_r160873545 --- Diff: python/pyspark/sql/context.py --- @@ -203,18 +203,46 @@ def registerFunction(self, name, f, returnType=StringType

[GitHub] spark issue #20222: [SPARK-23028] Bump master branch version to 2.4.0-SNAPSH...

2018-01-11 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20222 Jenkins, retest this please.

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r161123821 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,13 +237,19 @@ class Dataset[T] private[sql]( private[sql] def

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r161123864 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,13 +237,19 @@ class Dataset[T] private[sql]( private[sql] def

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r161123911 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,13 +237,19 @@ class Dataset[T] private[sql]( private[sql] def

[GitHub] spark issue #20239: [SPARK-23047][PYTHON][SQL] Change MapVector to NullableM...

2018-01-11 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20239 I'm not sure we can change to `NullableMapVector`; I'm worried about whether a plain `MapVector` can really never occur here.

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r161146793 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -457,13 +458,26 @@ class RelationalGroupedDataset protected[sql

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r161153123 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,13 +237,18 @@ class Dataset[T] private[sql]( private[sql] def

[GitHub] spark pull request #20204: [SPARK-7721][PYTHON][TESTS] Adds PySpark coverage...

2018-01-11 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20204#discussion_r161155869 --- Diff: python/run-tests-with-coverage --- @@ -0,0 +1,69 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark issue #20239: [SPARK-23047][PYTHON][SQL] Change MapVector to NullableM...

2018-01-11 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20239 Btw, I don't mean to block this PR, but, just out of curiosity, why does only `MapVector` have a `Nullable` version?

[GitHub] spark issue #20246: [SPARK-23054][SQL] Fix incorrect results of casting User...

2018-01-12 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20246 @maropu I guess you should elaborate on the problem with the result string in the description. The changes LGTM pending Jenkins, btw

[GitHub] spark issue #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregation func...

2018-01-14 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19872 LGTM. @HyukjinKwon Do you have any concerns about this? I'd also cc @cloud-fan for another look.

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161651611 --- Diff: python/pyspark/sql/catalog.py --- @@ -256,27 +258,58 @@ def registerFunction(self, name, f, returnType=StringType()): >>>

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161654514 --- Diff: python/pyspark/sql/catalog.py --- @@ -256,27 +258,58 @@ def registerFunction(self, name, f, returnType=StringType()): >>>

[GitHub] spark pull request #20204: [SPARK-7721][PYTHON][TESTS] Adds PySpark coverage...

2018-01-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20204#discussion_r161655584 --- Diff: python/run-tests-with-coverage --- @@ -0,0 +1,69 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161659245 --- Diff: python/pyspark/sql/catalog.py --- @@ -256,27 +258,58 @@ def registerFunction(self, name, f, returnType=StringType()): >>>

[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

2018-01-15 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20258 Is this similar to `Dataset.transform()` in the Java/Scala API? But we don't have similar APIs for RDDs.
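For readers unfamiliar with the comparison: a `transform`-style method takes a function from a dataset to a dataset and applies it to the current object, which lets user-defined steps be chained. The sketch below shows that pattern in plain Python; `Pipeline`, `drop_negatives` and `double` are hypothetical names used only for illustration and are not part of Spark or of the pull request.

```python
class Pipeline:
    """Hypothetical container standing in for a Dataset-like object."""

    def __init__(self, rows):
        self.rows = list(rows)

    def transform(self, func):
        # Hand the whole object to func and return whatever func returns.
        return func(self)


def drop_negatives(p):
    return Pipeline(r for r in p.rows if r >= 0)


def double(p):
    return Pipeline(r * 2 for r in p.rows)


# Custom steps chain like built-in methods.
result = Pipeline([3, -1, 2]).transform(drop_negatives).transform(double)
```

The design choice is that the object itself needs no knowledge of the individual steps; each step is an ordinary function that can be tested on its own.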

[GitHub] spark pull request #20204: [SPARK-7721][PYTHON][TESTS] Adds PySpark coverage...

2018-01-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20204#discussion_r161679707 --- Diff: python/run-tests-with-coverage --- @@ -0,0 +1,69 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161711980 --- Diff: python/pyspark/sql/tests.py --- @@ -4037,6 +4082,15 @@ def test_simple(self): expected = df.toPandas().groupby('id

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161709870 --- Diff: python/pyspark/sql/catalog.py --- @@ -226,18 +226,23 @@ def dropGlobalTempView(self, viewName): @ignore_unicode_prefix

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161711905 --- Diff: python/pyspark/sql/tests.py --- @@ -3975,33 +4003,50 @@ def test_vectorized_udf_timestamps_respect_session_timezone(self): finally

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r161710081 --- Diff: python/pyspark/sql/context.py --- @@ -174,18 +174,23 @@ def range(self, start, end=None, step=1, numPartitions=None

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r161716594 --- Diff: python/pyspark/sql/group.py --- @@ -233,6 +233,27 @@ def apply(self, udf): | 2| 1.1094003924504583

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r161765993 --- Diff: python/pyspark/sql/group.py --- @@ -233,6 +233,27 @@ def apply(self, udf): | 2| 1.1094003924504583

[GitHub] spark issue #21037: [SPARK-23919][SQL] Add array_position function

2018-04-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/21037 Thanks! merging to master.

[GitHub] spark pull request #21008: [SPARK-23902][SQL] Add roundOff flag to months_be...

2018-04-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21008#discussion_r182625871 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala --- @@ -453,34 +453,45 @@ class

[GitHub] spark pull request #21008: [SPARK-23902][SQL] Add roundOff flag to months_be...

2018-04-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21008#discussion_r182625943 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -870,24 +870,14 @@ object DateTimeUtils { * If

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182663898 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -505,3 +506,150 @@ case class ArrayMax

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182674039 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -417,3 +418,156 @@ case class ArrayMax

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182671380 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -169,4 +169,45 @@ class

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182672237 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -505,3 +506,150 @@ case class ArrayMax

[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r182680902 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -288,6 +288,114 @@ case class

[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r182689303 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -288,6 +288,114 @@ case class

[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r182686266 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -106,6 +106,30 @@ class

[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r182682158 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -288,6 +288,114 @@ case class

[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21028#discussion_r182688229 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -288,6 +288,114 @@ case class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182705280 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -105,4 +105,28 @@ class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182701040 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,101 @@ case class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182702693 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,101 @@ case class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182706982 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -102,6 +102,12 @@ trait

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182701319 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,101 @@ case class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182701643 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,101 @@ case class

[GitHub] spark pull request #21040: [SPARK-23930][SQL] Add slice function

2018-04-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21040#discussion_r182703273 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,101 @@ case class

[GitHub] spark issue #21053: [SPARK-23924][SQL] Add element_at function

2018-04-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/21053 Thanks! merging to master.

[GitHub] spark issue #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20938 Jenkins, retest this please.

[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20858 Thanks! merging to master.

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182973891 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -169,4 +169,45 @@ class

[GitHub] spark issue #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20938 LGTM pending Jenkins.

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r182990028 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala --- @@ -169,4 +169,45 @@ class

[GitHub] spark pull request #21021: [SPARK-23921][SQL] Add array_sort function

2018-04-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21021#discussion_r182998415 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -117,47 +117,18 @@ case class MapValues

[GitHub] spark pull request #21021: [SPARK-23921][SQL] Add array_sort function

2018-04-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21021#discussion_r182993016 --- Diff: python/pyspark/sql/functions.py --- @@ -2168,6 +2171,23 @@ def sort_array(col, asc=True): return Column(sc._jvm.functions.sort_array

[GitHub] spark pull request #21021: [SPARK-23921][SQL] Add array_sort function

2018-04-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21021#discussion_r182999463 --- Diff: python/pyspark/sql/functions.py --- @@ -2168,6 +2171,23 @@ def sort_array(col, asc=True): return Column(sc._jvm.functions.sort_array
