[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-08 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r149626358 --- Diff: python/pyspark/serializers.py --- @@ -213,7 +213,15 @@ def __repr__(self): return "ArrowSerializer" -def _cr

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r149661483 --- Diff: python/pyspark/sql/session.py --- @@ -557,7 +577,13 @@ def createDataFrame(self, data, schema=None, samplingRatio=None, verifySchema=Tr

[GitHub] spark pull request #19702: [SPARK-10365][SQL] Support Parquet logical type T...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19702#discussion_r149863117 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -428,15 +417,9 @@ object

[GitHub] spark pull request #19646: [SPARK-22417][PYTHON] Fix for createDataFrame fro...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19646#discussion_r149865739 --- Diff: python/pyspark/sql/tests.py --- @@ -2592,6 +2592,21 @@ def test_create_dataframe_from_array_of_long(self): df

[GitHub] spark pull request #19704: [SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix f...

2017-11-08 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/19704 [SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix for createDataFrame from pandas.DataFrame with timestamp ## What changes were proposed in this pull request? This is a follow-up of #19646

[GitHub] spark pull request #19646: [SPARK-22417][PYTHON] Fix for createDataFrame fro...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19646#discussion_r149867086 --- Diff: python/pyspark/sql/tests.py --- @@ -2592,6 +2592,21 @@ def test_create_dataframe_from_array_of_long(self): df

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r149871432 --- Diff: python/pyspark/serializers.py --- @@ -213,7 +213,15 @@ def __repr__(self): return "ArrowSerializer" -def _cr

[GitHub] spark pull request #19630: wip: [SPARK-22409] Introduce function type argume...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19630#discussion_r149873461 --- Diff: python/pyspark/sql/udf.py --- @@ -0,0 +1,136 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] spark pull request #19630: wip: [SPARK-22409] Introduce function type argume...

2017-11-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19630#discussion_r149873412 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -23,14 +23,15 @@ import scala.collection.JavaConverters

[GitHub] spark issue #19704: [SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix for crea...

2017-11-08 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19704 Thanks for reviewing! merging to branch-2.2.

[GitHub] spark pull request #19704: [SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix f...

2017-11-08 Thread ueshin
Github user ueshin closed the pull request at: https://github.com/apache/spark/pull/19704

[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-11-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19459#discussion_r149887860 --- Diff: python/pyspark/serializers.py --- @@ -214,6 +214,14 @@ def __repr__(self): def _create_batch(series): +"""

[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...

2017-11-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19702 LGTM pending tests.

[GitHub] spark pull request #19702: [SPARK-10365][SQL] Support Parquet logical type T...

2017-11-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19702#discussion_r149940418 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -982,7 +941,7 @@ class

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157931824 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157931925 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala --- @@ -15,10 +15,9 @@ * limitations under the

[GitHub] spark issue #19872: WIP: [SPARK-22274][PySpark] User-defined aggregation fun...

2017-12-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19872 @ramacode2014 Hi, I'm not sure why you received notifications from this PR, but I guess you can unsubscribe by the "Unsubscribe" button in the right column of this page. Sorry for th

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157938453 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157944622 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157939292 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157944969 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r157948426 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -48,29 +48,46 @@ object

[GitHub] spark pull request #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0

2017-12-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19884#discussion_r157958862 --- Diff: python/pyspark/sql/udf.py --- @@ -33,6 +33,10 @@ def _wrap_function(sc, func, returnType): def _create_udf(f, returnType, evalType

[GitHub] spark pull request #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0

2017-12-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19884#discussion_r157961467 --- Diff: python/pyspark/sql/udf.py --- @@ -33,6 +33,10 @@ def _wrap_function(sc, func, returnType): def _create_udf(f, returnType, evalType

[GitHub] spark pull request #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0

2017-12-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19884#discussion_r158205751 --- Diff: python/pyspark/sql/tests.py --- @@ -3356,6 +3356,7 @@ def test_schema_conversion_roundtrip(self): self.assertEquals(self.schema

[GitHub] spark pull request #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0

2017-12-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19884#discussion_r158206546 --- Diff: python/pyspark/sql/utils.py --- @@ -110,3 +110,12 @@ def toJArray(gateway, jtype, arr): for i in range(0, len(arr)): jarr[i

[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0

2017-12-21 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19884 Jenkins, retest this please.

[GitHub] spark pull request #18754: [WIP][SPARK-21552][SQL] Add DecimalType support t...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18754#discussion_r158422348 --- Diff: python/pyspark/sql/types.py --- @@ -1617,7 +1617,7 @@ def to_arrow_type(dt): elif type(dt) == DoubleType: arrow_type

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158423362 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the

[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20036#discussion_r158430481 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -118,9 +118,8 @@ case class Like(left

[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20036#discussion_r158430491 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -194,9 +193,8 @@ case class RLike(left

[GitHub] spark pull request #20054: [SPARK-22874][PYSPARK][SQL] Modify checking panda...

2017-12-22 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/20054 [SPARK-22874][PYSPARK][SQL] Modify checking pandas version to use LooseVersion. ## What changes were proposed in this pull request? Currently we check pandas version by capturing if

[GitHub] spark pull request #20054: [SPARK-22874][PYSPARK][SQL] Modify checking panda...

2017-12-22 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20054#discussion_r158460651 --- Diff: python/pyspark/sql/utils.py --- @@ -112,6 +112,15 @@ def toJArray(gateway, jtype, arr): return jarr +def

[GitHub] spark pull request #18754: [SPARK-21552][SQL] Add DecimalType support to Arr...

2017-12-24 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18754#discussion_r158620106 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala --- @@ -214,6 +216,22 @@ private[arrow] class DoubleWriter(val

[GitHub] spark pull request #20074: [SPARK-22874][PYSPARK][SQL][FOLLOW-UP] Modify err...

2017-12-24 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/20074 [SPARK-22874][PYSPARK][SQL][FOLLOW-UP] Modify error messages to show actual versions. ## What changes were proposed in this pull request? This is a follow-up pr of #20054 modifying error

[GitHub] spark pull request #20074: [SPARK-22874][PYSPARK][SQL][FOLLOW-UP] Modify err...

2017-12-24 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20074#discussion_r158622968 --- Diff: python/pyspark/sql/utils.py --- @@ -118,7 +118,8 @@ def require_minimum_pandas_version(): from distutils.version import LooseVersion
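The version check discussed in #20054 and #20074 can be sketched roughly as follows. This is a minimal illustration, not the code from either pull request. `distutils` was removed from the standard library in Python 3.12, so the sketch swaps in a small hand-written tuple comparison in place of `LooseVersion`. The names `parse_version` and `require_minimum_version`, the wording of the error message, and the version numbers in the usage lines are all assumptions made for the example.

```python
import re


def parse_version(version):
    """Turn a dotted version string into a tuple of ints.

    Hypothetical helper standing in for LooseVersion: it keeps the leading
    digits of each dotted piece and stops at the first piece with none.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def require_minimum_version(name, actual, minimum):
    """Raise ImportError naming both versions if `actual` is older than `minimum`.

    Hypothetical signature; the real helper in the pull request takes no arguments.
    """
    if parse_version(actual) < parse_version(minimum):
        raise ImportError(
            "%s >= %s must be installed; however, your version was %s."
            % (name, minimum, actual)
        )


# Usage with placeholder version numbers.
require_minimum_version("pandas", "0.23.4", "0.19.2")  # passes silently
try:
    require_minimum_version("pandas", "0.9.0", "0.19.2")
except ImportError as error:
    print(error)
```

Comparing the raw strings would rank "0.9.0" above "0.19.2", which is why the versions are parsed before they are compared, and the message includes the installed version so the user can see what was actually found.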

[GitHub] spark issue #20074: [SPARK-22874][PYSPARK][SQL][FOLLOW-UP] Modify error mess...

2017-12-25 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20074 Thanks for reviewing! merging to master.

[GitHub] spark pull request #19954: [SPARK-22757][Kubernetes] Enable use of remote de...

2017-12-25 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19954#discussion_r158675633 --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/init-container/Dockerfile --- @@ -0,0 +1,24 @@ +# +# Licensed to the Apache

[GitHub] spark pull request #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setu...

2017-12-26 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/20089 [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py file. ## What changes were proposed in this pull request? This is a follow-up pr of #19884 updating setup.py file to add pyarrow

[GitHub] spark issue #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py fi...

2017-12-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20089 Btw, should we add `'Programming Language :: Python :: 3.6'` to `classifiers`?

[GitHub] spark issue #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py fi...

2017-12-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20089 @HyukjinKwon Thanks! I'll add it soon.

[GitHub] spark issue #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py fi...

2017-12-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20089 @HyukjinKwon I'll update it as well.

[GitHub] spark pull request #20059: [SPARK-22648][K8s] Add documentation covering ini...

2017-12-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20059#discussion_r158767810 --- Diff: docs/running-on-kubernetes.md --- @@ -528,51 +576,91 @@ specific to Spark on Kubernetes

[GitHub] spark pull request #20059: [SPARK-22648][K8s] Add documentation covering ini...

2017-12-26 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20059#discussion_r158766743 --- Diff: docs/running-on-kubernetes.md --- @@ -120,6 +120,54 @@ by their appropriate remote URIs. Also, application dependencies can be pre-moun Those

[GitHub] spark pull request #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setu...

2017-12-27 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20089#discussion_r158789947 --- Diff: python/README.md --- @@ -29,4 +29,4 @@ The Python packaging for Spark is not intended to replace all of the other use c ## Python

[GitHub] spark pull request #20089: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setu...

2017-12-27 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20089#discussion_r158799303 --- Diff: python/README.md --- @@ -29,4 +29,4 @@ The Python packaging for Spark is not intended to replace all of the other use c ## Python

[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-27 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19954 Thanks! merging to master.

[GitHub] spark issue #20059: [SPARK-22648][K8s] Add documentation covering init conta...

2017-12-27 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20059 Thanks! merging to master.

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158909734 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -171,6 +171,7 @@ trait CheckAnalysis extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158911077 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -273,7 +274,7 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158902463 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -39,13 +39,16 @@ private[spark] object PythonEvalType { val

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158912156 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -92,8 +99,14 @@ object ExtractPythonUDFFromAggregate

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158907883 --- Diff: python/pyspark/sql/tests.py --- @@ -477,6 +502,7 @@ def test_udf_with_aggregate_function(self): sel = df.groupBy(my_copy(col("

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158901221 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -0,0 +1,140 @@ +/* + * Licensed to the

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158913244 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -215,3 +228,49 @@ object ExtractPythonUDFs extends

[GitHub] spark pull request #19872: WIP: [SPARK-22274][PySpark] User-defined aggregat...

2017-12-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r158912955 --- Diff: python/pyspark/sql/tests.py --- @@ -4052,6 +4066,323 @@ def test_unsupported_types(self): df.groupby('id').apply(

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 LGTM for the change, but I'm not sure whether the test was indeed triggered or not.

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 I confirmed the test came to pass after the patch in my local environment.

[GitHub] spark issue #20110: [SPARK-22313][PYTHON][FOLLOWUP] Explicitly import warnin...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20110 Thanks! merging to master.

[GitHub] spark issue #20114: [SPARK-22530][PYTHON][SQL] Adding Arrow support for Arra...

2017-12-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20114 How about simply returning `false` from `ArrowVectorAccessor.isNullAt(int rowId)` when `accessor.getValueCount() > 0 && accessor.getValidityBuffer().capacity() == 0` without modifying
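The condition proposed in this comment can be modelled with a small stand-in class. This is only a sketch of the branching logic, written in Python rather than the Java of the real accessor; the class `ValidityAwareAccessor` and its methods are hypothetical and do not correspond to Spark's or Arrow's actual API. It assumes that a vector holding values but carrying an empty validity buffer should report every row as non-null.

```python
class ValidityAwareAccessor:
    """Hypothetical model of a vector accessor (not Spark's or Arrow's API).

    `values` holds the row values; `validity` holds one boolean per row,
    True meaning the row is set. An empty `validity` list models a validity
    buffer with capacity 0.
    """

    def __init__(self, values, validity=None):
        self.values = list(values)
        self.validity = list(validity) if validity is not None else []

    def value_count(self):
        return len(self.values)

    def validity_capacity(self):
        return len(self.validity)

    def is_null_at(self, row_id):
        # Proposed shortcut: values are present but no validity buffer was
        # allocated, so report the row as non-null.
        if self.value_count() > 0 and self.validity_capacity() == 0:
            return False
        # Otherwise consult the validity buffer as usual.
        return not self.validity[row_id]


no_validity = ValidityAwareAccessor([1, 2, 3])
with_validity = ValidityAwareAccessor([1, 0, 3], validity=[True, False, True])
print(no_validity.is_null_at(1))    # shortcut branch
print(with_validity.is_null_at(1))  # regular validity lookup
```

Keeping the check inside the null test leaves the rest of the accessor unchanged, which appears to be the point of the suggestion.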

[GitHub] spark issue #20114: [SPARK-22530][PYTHON][SQL] Adding Arrow support for Arra...

2017-12-29 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20114 Jenkins, retest this please.

[GitHub] spark pull request #20115: [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test...

2017-12-29 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/20115 [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test failure when xmlrunner is installed. ## What changes were proposed in this pull request? This is a follow-up pr of #19587. If

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 Jenkins, retest this please.

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 @HyukjinKwon Thanks, I'll take another look soon.

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 LGTM, pending Jenkins.

[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20151 The changes LGTM. Btw, what if we miss the module in python path? Can we see that the error is caused by the missing module from the exception message

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19792 Thanks! merging to master/2.3.

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Support for deploying Anaconda an...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/14180 @gatorsmile @jiangxb1987 Maybe we should review and merge #13599 first because this pr is based on it.

[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20151 Looks good. Let's wait for @rxin's response.

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160085587 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088373 --- Diff: python/pyspark/context.py --- @@ -1023,6 +1039,33 @@ def getConf(self): conf.setAll(self._conf.getAll()) return conf

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160087986 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088297 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160093389 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160093249 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160087899 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160091411 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160085499 --- Diff: docs/submitting-applications.md --- @@ -218,6 +218,73 @@ These commands can be used with `pyspark`, `spark-shell`, and `spark-submit` to For

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160083009 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081357 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160090598 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160079462 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160083172 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081109 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160081322 --- Diff: core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r160088202 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -299,20 +301,39 @@ // 4. environment variable

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318424 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318804 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160318854 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark pull request #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs ...

2018-01-08 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20171#discussion_r160319860 --- Diff: python/pyspark/sql/tests.py --- @@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self): bool_f(col('

[GitHub] spark issue #20163: [SPARK-22966][PySpark] Spark SQL should handle Python UD...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20163 I investigated the behavior differences between `udf` and `pandas_udf` for the wrong return types and found there are many differences actually. Basically `udf`s return `null` as @HyukjinKwon

[GitHub] spark pull request #20163: [SPARK-22966][PySpark] Spark SQL should handle Py...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20163#discussion_r160364055 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala --- @@ -120,10 +121,18 @@ object EvaluatePython

[GitHub] spark issue #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pandas ti...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20213 LGTM.

[GitHub] spark pull request #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pa...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20213#discussion_r160588603 --- Diff: python/pyspark/sql/session.py --- @@ -459,21 +459,23 @@ def _convert_from_pandas(self, pdf, schema, timezone): # TODO

[GitHub] spark issue #20213: [SPARK-23018][PYTHON] Fix createDataFrame from Pandas ti...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20213 Thanks! merging to master/2.3.

[GitHub] spark issue #20210: [SPARK-23009][PYTHON] Fix for non-str col names to creat...

2018-01-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20210 LGTM.

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r160605967 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -457,13 +458,26 @@ class RelationalGroupedDataset protected[sql

[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20211#discussion_r160599447 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala --- @@ -80,27 +84,77 @@ case class

[GitHub] spark pull request #20214: [SPARK-23023][SQL] Cast field data to strings in ...

2018-01-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20214#discussion_r160613124 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1255,6 +1255,34 @@ class DataFrameSuite extends QueryTest with
