Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19607
Jenkins, retest this please.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19459#discussion_r149626358
--- Diff: python/pyspark/serializers.py ---
@@ -213,7 +213,15 @@ def __repr__(self):
return "ArrowSerializer"
-def _cr
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19607#discussion_r149661483
--- Diff: python/pyspark/sql/session.py ---
@@ -557,7 +577,13 @@ def createDataFrame(self, data, schema=None,
samplingRatio=None, verifySchema=Tr
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19702#discussion_r149863117
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -428,15 +417,9 @@ object
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19646#discussion_r149865739
--- Diff: python/pyspark/sql/tests.py ---
@@ -2592,6 +2592,21 @@ def test_create_dataframe_from_array_of_long(self):
df
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/19704
[SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix for createDataFrame from
pandas.DataFrame with timestamp
## What changes were proposed in this pull request?
This is a follow-up of #19646
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19646#discussion_r149867086
--- Diff: python/pyspark/sql/tests.py ---
@@ -2592,6 +2592,21 @@ def test_create_dataframe_from_array_of_long(self):
df
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19459#discussion_r149871432
--- Diff: python/pyspark/serializers.py ---
@@ -213,7 +213,15 @@ def __repr__(self):
return "ArrowSerializer"
-def _cr
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r149873461
--- Diff: python/pyspark/sql/udf.py ---
@@ -0,0 +1,136 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r149873412
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -23,14 +23,15 @@ import scala.collection.JavaConverters
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19704
Thanks for reviewing! Merging to branch-2.2.
Github user ueshin closed the pull request at:
https://github.com/apache/spark/pull/19704
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19459#discussion_r149887860
--- Diff: python/pyspark/serializers.py ---
@@ -214,6 +214,14 @@ def __repr__(self):
def _create_batch(series):
+"""
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19702
LGTM pending tests.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19702#discussion_r149940418
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -982,7 +941,7 @@ class
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157931824
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157931925
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala
---
@@ -15,10 +15,9 @@
* limitations under the
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19872
@ramacode2014 Hi, I'm not sure why you received notifications from this PR,
but I guess you can unsubscribe via the "Unsubscribe" button in the right column
of this page. Sorry for th
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157938453
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -48,29 +48,46 @@ object
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157944622
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157939292
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -48,29 +48,46 @@ object
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157944969
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r157948426
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -48,29 +48,46 @@ object
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19884#discussion_r157958862
--- Diff: python/pyspark/sql/udf.py ---
@@ -33,6 +33,10 @@ def _wrap_function(sc, func, returnType):
def _create_udf(f, returnType, evalType
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19884#discussion_r157961467
--- Diff: python/pyspark/sql/udf.py ---
@@ -33,6 +33,10 @@ def _wrap_function(sc, func, returnType):
def _create_udf(f, returnType, evalType
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19884#discussion_r158205751
--- Diff: python/pyspark/sql/tests.py ---
@@ -3356,6 +3356,7 @@ def test_schema_conversion_roundtrip(self):
self.assertEquals(self.schema
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19884#discussion_r158206546
--- Diff: python/pyspark/sql/utils.py ---
@@ -110,3 +110,12 @@ def toJArray(gateway, jtype, arr):
for i in range(0, len(arr)):
jarr[i
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19884
Jenkins, retest this please.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18754#discussion_r158422348
--- Diff: python/pyspark/sql/types.py ---
@@ -1617,7 +1617,7 @@ def to_arrow_type(dt):
elif type(dt) == DoubleType:
arrow_type
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158423362
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20036#discussion_r158430481
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
---
@@ -118,9 +118,8 @@ case class Like(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20036#discussion_r158430491
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
---
@@ -194,9 +193,8 @@ case class RLike(left
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/20054
[SPARK-22874][PYSPARK][SQL] Modify checking pandas version to use
LooseVersion.
## What changes were proposed in this pull request?
Currently we check pandas version by capturing if
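The PR replaces the pandas version check with `distutils.version.LooseVersion`. The core idea — compare version components numerically rather than as strings — can be sketched without `distutils` (which is removed in recent Pythons). All names below are illustrative, not Spark's actual helpers, and the minimum version shown is a placeholder:

```python
def version_tuple(v):
    # Split "0.19.2" into (0, 19, 2) so the comparison is numeric per
    # component, like LooseVersion, instead of lexicographic.
    return tuple(int(p) for p in v.split(".") if p.isdigit())

# Illustrative minimum only; not necessarily the version the PR requires.
_MINIMUM_PANDAS_VERSION = "0.19.2"

def require_minimum_pandas_version(installed):
    # Raise if the installed pandas version string is too old.
    if version_tuple(installed) < version_tuple(_MINIMUM_PANDAS_VERSION):
        raise ImportError("Pandas >= %s must be installed; your version is %s."
                          % (_MINIMUM_PANDAS_VERSION, installed))

# Plain string comparison gets this wrong: "0.9.0" > "0.19.2"
# lexicographically, even though 0.9.0 is the older release.
```

This is why a `LooseVersion`-style check is preferable to comparing raw version strings.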
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20054#discussion_r158460651
--- Diff: python/pyspark/sql/utils.py ---
@@ -112,6 +112,15 @@ def toJArray(gateway, jtype, arr):
return jarr
+def
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18754#discussion_r158620106
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
---
@@ -214,6 +216,22 @@ private[arrow] class DoubleWriter(val
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/20074
[SPARK-22874][PYSPARK][SQL][FOLLOW-UP] Modify error messages to show actual
versions.
## What changes were proposed in this pull request?
This is a follow-up pr of #20054 modifying error
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20074#discussion_r158622968
--- Diff: python/pyspark/sql/utils.py ---
@@ -118,7 +118,8 @@ def require_minimum_pandas_version():
from distutils.version import LooseVersion
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20074
Thanks for reviewing! Merging to master.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19954#discussion_r158675633
--- Diff:
resource-managers/kubernetes/docker/src/main/dockerfiles/init-container/Dockerfile
---
@@ -0,0 +1,24 @@
+#
+# Licensed to the Apache
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/20089
[SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py file.
## What changes were proposed in this pull request?
This is a follow-up pr of #19884 updating setup.py file to add pyarrow
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20089
Btw, should we add `'Programming Language :: Python :: 3.6'` to
`classifiers`?
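For context, PyPI trove classifiers are declared in the `classifiers` list passed to `setup()`. A hypothetical fragment illustrating the suggestion (the surrounding entries are invented, not Spark's actual `setup.py`):

```python
# Hypothetical excerpt of a setup.py classifiers list; Spark's real
# file differs. Trove classifiers advertise supported Python versions
# on PyPI but are not enforced at install time.
classifiers = [
    'Development Status :: 5 - Production/Stable',
    'Programming Language :: Python :: 2.7',
    'Programming Language :: Python :: 3.4',
    'Programming Language :: Python :: 3.5',
]

# The suggestion in the comment above: also advertise 3.6 support.
classifiers.append('Programming Language :: Python :: 3.6')
```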
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20089
@HyukjinKwon Thanks! I'll add it soon.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20089
@HyukjinKwon I'll update it as well.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20059#discussion_r158767810
--- Diff: docs/running-on-kubernetes.md ---
@@ -528,51 +576,91 @@ specific to Spark on Kubernetes
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20059#discussion_r158766743
--- Diff: docs/running-on-kubernetes.md ---
@@ -120,6 +120,54 @@ by their appropriate remote URIs. Also, application
dependencies can be pre-moun
Those
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20089#discussion_r158789947
--- Diff: python/README.md ---
@@ -29,4 +29,4 @@ The Python packaging for Spark is not intended to replace
all of the other use c
## Python
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20089#discussion_r158799303
--- Diff: python/README.md ---
@@ -29,4 +29,4 @@ The Python packaging for Spark is not intended to replace
all of the other use c
## Python
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19954
Thanks! Merging to master.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20059
Thanks! Merging to master.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158909734
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
---
@@ -171,6 +171,7 @@ trait CheckAnalysis extends
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158911077
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -273,7 +274,7 @@ abstract class SparkStrategies extends
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158902463
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -39,13 +39,16 @@ private[spark] object PythonEvalType {
val
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158912156
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -92,8 +99,14 @@ object ExtractPythonUDFFromAggregate
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158907883
--- Diff: python/pyspark/sql/tests.py ---
@@ -477,6 +502,7 @@ def test_udf_with_aggregate_function(self):
sel = df.groupBy(my_copy(col(&quo
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158901221
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala
---
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158913244
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -215,3 +228,49 @@ object ExtractPythonUDFs extends
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r158912955
--- Diff: python/pyspark/sql/tests.py ---
@@ -4052,6 +4066,323 @@ def test_unsupported_types(self):
df.groupby('id').apply(
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20110
LGTM for the change, but I'm not sure whether the test was indeed triggered
or not.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20110
I confirmed the test passes after the patch in my local environment.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20110
Thanks! Merging to master.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20114
How about simply returning `false` from `ArrowVectorAccessor.isNullAt(int
rowId)` when `accessor.getValueCount() > 0 &&
accessor.getValidityBuffer().capacity() == 0` without modifying
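The suggestion relies on Arrow sometimes omitting the validity buffer entirely when a vector contains no nulls; in that case every row is non-null. A language-agnostic sketch of the proposed short-circuit, written here in Python with invented names (the real code is the Java/Scala `ArrowVectorAccessor`):

```python
class FakeAccessor:
    """Stand-in for an Arrow vector accessor, for illustration only."""
    def __init__(self, value_count, validity_bits):
        self.value_count = value_count
        # None models an omitted (zero-capacity) validity buffer,
        # which Arrow may use when the vector has no nulls at all.
        self.validity_bits = validity_bits

def is_null_at(accessor, row_id):
    # The proposed short-circuit: a non-empty vector with no
    # validity buffer cannot contain nulls, so answer False
    # without touching the (absent) buffer.
    if accessor.value_count > 0 and accessor.validity_bits is None:
        return False
    # Otherwise consult the validity bitmap: a set bit means valid.
    return not accessor.validity_bits[row_id]
```

The appeal of the suggestion is that it confines the fix to the accessor instead of modifying the buffers themselves.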
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20114
Jenkins, retest this please.
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/20115
[SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test failure when xmlrunner is
installed.
## What changes were proposed in this pull request?
This is a follow-up pr of #19587.
If
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19792
Jenkins, retest this please.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19792
@HyukjinKwon Thanks, I'll take another look soon.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19792
LGTM, pending Jenkins.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20151
The changes LGTM.
Btw, what if the module is missing from the Python path? Can we tell from the
exception message that the error is caused by the missing module?
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/19792
Thanks! Merging to master/2.3.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/14180
@gatorsmile @jiangxb1987 Maybe we should review and merge #13599 first
because this PR is based on it.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20151
Looks good. Let's wait for @rxin's response.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160085587
--- Diff: docs/submitting-applications.md ---
@@ -218,6 +218,73 @@ These commands can be used with `pyspark`,
`spark-shell`, and `spark-submit` to
For
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160088373
--- Diff: python/pyspark/context.py ---
@@ -1023,6 +1039,33 @@ def getConf(self):
conf.setAll(self._conf.getAll())
return conf
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160087986
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160088297
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160093389
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160093249
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160087899
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160091411
--- Diff: docs/submitting-applications.md ---
@@ -218,6 +218,73 @@ These commands can be used with `pyspark`,
`spark-shell`, and `spark-submit` to
For
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160085499
--- Diff: docs/submitting-applications.md ---
@@ -218,6 +218,73 @@ These commands can be used with `pyspark`,
`spark-shell`, and `spark-submit` to
For
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160083009
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160081357
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160090598
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160079462
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160083172
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160081109
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160081322
--- Diff:
core/src/main/scala/org/apache/spark/api/python/VirtualEnvFactory.scala ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/13599#discussion_r160088202
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
---
@@ -299,20 +301,39 @@
// 4. environment variable
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20171#discussion_r160318424
--- Diff: python/pyspark/sql/tests.py ---
@@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self):
bool_f(col('
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20171#discussion_r160318804
--- Diff: python/pyspark/sql/tests.py ---
@@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self):
bool_f(col('
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20171#discussion_r160318854
--- Diff: python/pyspark/sql/tests.py ---
@@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self):
bool_f(col('
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20171#discussion_r160319860
--- Diff: python/pyspark/sql/tests.py ---
@@ -3616,6 +3616,34 @@ def test_vectorized_udf_basic(self):
bool_f(col('
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20163
I investigated the behavior differences between `udf` and `pandas_udf` for
the wrong return types and found there are many differences actually.
Basically `udf`s return `null` as @HyukjinKwon
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20163#discussion_r160364055
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
---
@@ -120,10 +121,18 @@ object EvaluatePython
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20213
LGTM.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20213#discussion_r160588603
--- Diff: python/pyspark/sql/session.py ---
@@ -459,21 +459,23 @@ def _convert_from_pandas(self, pdf, schema, timezone):
# TODO
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20213
Thanks! Merging to master/2.3.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20210
LGTM.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20211#discussion_r160605967
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -457,13 +458,26 @@ class RelationalGroupedDataset protected[sql
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20211#discussion_r160599447
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala
---
@@ -80,27 +84,77 @@ case class
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20214#discussion_r160613124
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
---
@@ -1255,6 +1255,34 @@ class DataFrameSuite extends QueryTest with