dtenedor commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1350762771
##
python/pyspark/sql/tests/test_udtf.py:
##
@@ -2309,6 +2309,55 @@ def terminate(self):
+ [Row(partition_col=42, count=3, total=3, last=None)],
dtenedor commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1348084412
##
python/pyspark/sql/udtf.py:
##
@@ -107,12 +107,20 @@ class AnalyzeResult:
If non-empty, this is a sequence of columns that the UDTF is
specifying for Cata
dtenedor commented on PR #43204:
URL: https://github.com/apache/spark/pull/43204#issuecomment-1753741579
Hi @allisonwang-db @ueshin, thanks for your reviews; these were good
comments. Please look again! I think the new API is better now.
--
This is an automated message from the Apache Git
dtenedor commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1350763173
##
python/pyspark/worker.py:
##
@@ -786,6 +787,24 @@ def _remove_partition_by_exprs(self, arg: Any) -> Any:
else:
return arg
+# Wra
dtenedor commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1350763041
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/PythonUDF.scala:
##
@@ -167,22 +169,26 @@ abstract class UnevaluableGenerator extends Generato
allisonwang-db commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1347981218
##
python/pyspark/sql/tests/test_udtf.py:
##
@@ -2309,6 +2309,55 @@ def terminate(self):
+ [Row(partition_col=42, count=3, total=3, last=None)],
ueshin commented on code in PR #43204:
URL: https://github.com/apache/spark/pull/43204#discussion_r1346427159
##
sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonFunction.scala:
##
@@ -290,6 +295,20 @@ object UserDefinedPythonTableFunction {
HyukjinKwon commented on PR #43204:
URL: https://github.com/apache/spark/pull/43204#issuecomment-1746074413
Implementation seems fine from a cursory look, but let me defer to
@allisonwang-db and @ueshin for the design.
dtenedor commented on PR #43204:
URL: https://github.com/apache/spark/pull/43204#issuecomment-1745840699
cc @ueshin @allisonwang-db @HyukjinKwon
dtenedor opened a new pull request, #43204:
URL: https://github.com/apache/spark/pull/43204
### What changes were proposed in this pull request?
This PR adds a Python UDTF API for 'analyze' to return a buffer to consume
on each class creation.
* The `AnalyzeResult` class now co
10 matches