ukby1234 commented on PR #42155:
URL: https://github.com/apache/spark/pull/42155#issuecomment-1665042187
It's been a while since I opened this pull request. Can I get someone to
review this PR? cc @mridulm
--
This is an automated message from the Apache Git Service.
To respond to the
HyukjinKwon commented on PR #42339:
URL: https://github.com/apache/spark/pull/42339#issuecomment-1665033273
cc @xinrong-meng too
HyukjinKwon closed pull request #42206: [SPARK-44582][SQL] Skip iterator on SMJ
if it was cleaned up
URL: https://github.com/apache/spark/pull/42206
HyukjinKwon commented on PR #42206:
URL: https://github.com/apache/spark/pull/42206#issuecomment-1665029179
Merged to master.
Madhukar98 opened a new pull request, #42339:
URL: https://github.com/apache/spark/pull/42339
### What changes were proposed in this pull request?
The fix is to use openpyxl by default instead of xlrd.
### Why are the changes needed?
test_to_excel test case was
bersprockets commented on PR #42206:
URL: https://github.com/apache/spark/pull/42206#issuecomment-1664996034
Thanks. Looks good.
HyukjinKwon commented on PR #42338:
URL: https://github.com/apache/spark/pull/42338#issuecomment-1664963753
cc @juliuszsompolski @zhengruifeng @ueshin Please take a look
HyukjinKwon opened a new pull request, #42338:
URL: https://github.com/apache/spark/pull/42338
### What changes were proposed in this pull request?
This is the symmetric fix to https://github.com/apache/spark/pull/42282.
### Why are the changes needed?
See also
hvanhovell commented on PR #42331:
URL: https://github.com/apache/spark/pull/42331#issuecomment-1664950906
A bit of a monkey wrench. I am fine with the current approach. I am just
wondering if at this point using the gRPC iterators is the easiest? Would it be
easier to use a stream
pan3793 commented on PR #42336:
URL: https://github.com/apache/spark/pull/42336#issuecomment-1664948047
cc @wangyum @ulysses-you @yaooqinn
allisonwang-db commented on PR #42302:
URL: https://github.com/apache/spark/pull/42302#issuecomment-1664945741
Yup we need this in branch-3.5. Created
https://github.com/apache/spark/pull/42337
allisonwang-db opened a new pull request, #42337:
URL: https://github.com/apache/spark/pull/42337
…
This PR improves error messages when the result of a Python UDTF is not an
Iterable. It also improves the error messages when a UDTF encounters an
exception when executing `eval`.
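The kind of check described above can be sketched in plain Python. This is only an illustration under stated assumptions: `check_udtf_result` is a hypothetical helper, not Spark's actual code, and the error wording is invented. It rejects anything that is not iterable (including `None`), which may differ from how Spark treats such values.

```python
from collections.abc import Iterable


def check_udtf_result(result, method_name="eval"):
    """Hypothetical helper: fail early with a readable message when a
    UDTF method returns something that cannot be iterated over."""
    if not isinstance(result, Iterable):
        # The message text is an assumption for illustration only.
        raise TypeError(
            f"The return value of the UDTF's '{method_name}' method "
            f"must be iterable, but got {type(result).__name__}."
        )
    return result


# An iterable of row tuples passes through unchanged.
rows = check_udtf_result([(1, "a"), (2, "b")])

# A bare integer is rejected with a descriptive error.
try:
    check_udtf_result(42)
except TypeError as exc:
    message = str(exc)
```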
pan3793 opened a new pull request, #42336:
URL: https://github.com/apache/spark/pull/42336
### What changes were proposed in this pull request?
Add file extensions for Parquet/ORC files written using Hive Serde, to keep
behavior consistent with Spark DataSource
liangyu-1 commented on PR #42295:
URL: https://github.com/apache/spark/pull/42295#issuecomment-1664932235
> The staging directory is cleaned automatically by Spark, why do you even
> need this hook?
@yaooqinn
Spark cleans the staging directory in this hook, in Spark 2.4
HyukjinKwon commented on PR #42118:
URL: https://github.com/apache/spark/pull/42118#issuecomment-1664932454
@mathewjacob1002 and @maddiedawson can you follow up ^ please?
HyukjinKwon commented on PR #42332:
URL: https://github.com/apache/spark/pull/42332#issuecomment-1664931905
cc @allisonwang-db @xinrong-meng @itholic
asl3 commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283948313
##
python/docs/source/getting_started/index.rst:
##
@@ -40,3 +40,4 @@ The list below is the contents of this quickstart page:
quickstart_df
quickstart_connect
7mming7 opened a new pull request, #42335:
URL: https://github.com/apache/spark/pull/42335
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
HyukjinKwon commented on code in PR #42332:
URL: https://github.com/apache/spark/pull/42332#discussion_r1283947706
##
python/pyspark/testing/pandasutils.py:
##
@@ -159,13 +160,26 @@ def _assert_pandas_almost_equal(
This function checks if given pandas objects approximately
HyukjinKwon commented on code in PR #42332:
URL: https://github.com/apache/spark/pull/42332#discussion_r1283947475
##
python/pyspark/testing/pandasutils.py:
##
@@ -159,13 +160,26 @@ def _assert_pandas_almost_equal(
This function checks if given pandas objects approximately
zhengruifeng opened a new pull request, #42334:
URL: https://github.com/apache/spark/pull/42334
### What changes were proposed in this pull request?
Uninstall large ML libraries for non-ML jobs
### Why are the changes needed?
ML is integrating external frameworks: torch,
zhengruifeng opened a new pull request, #42333:
URL: https://github.com/apache/spark/pull/42333
### What changes were proposed in this pull request?
Uninstall CodeQL/Go/Node in non-container jobs
### Why are the changes needed?
It can save 10 GB of disk space
before this
yaooqinn commented on PR #42287:
URL: https://github.com/apache/spark/pull/42287#issuecomment-1664924280
cc @tgravescs @cloud-fan @HyukjinKwon, thanks
cloud-fan closed pull request #42315: [SPARK-44653][SQL] Non-trivial DataFrame
unions should not break caching
URL: https://github.com/apache/spark/pull/42315
cloud-fan commented on PR #42315:
URL: https://github.com/apache/spark/pull/42315#issuecomment-1664918205
thanks for the review, merging to master/3.5/3.4!
wangyum commented on code in PR #42315:
URL: https://github.com/apache/spark/pull/42315#discussion_r1283936632
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2272,9 +2316,7 @@ class Dataset[T] private[sql](
* @since 2.0.0
*/
def union(other:
zhengruifeng commented on PR #42118:
URL: https://github.com/apache/spark/pull/42118#issuecomment-1664917362
The following tests are actually skipped:
```
Skipped tests in pyspark.ml.deepspeed.tests.test_deepspeed_distributor with
python3.9:
test_pytorch_file_e2e
cloud-fan commented on PR #42223:
URL: https://github.com/apache/spark/pull/42223#issuecomment-1664916455
For merging `func1(...) ... WHERE cond1` and `func2(...) ... WHERE cond2`,
we got
```
func1(...) FILTER cond1, func2(...) FILTER cond2 ... WHERE cond1 OR cond2
```
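The rewrite described above can be checked on a toy table. The sketch below uses Python's built-in `sqlite3` rather than Spark, and the table `t`, its columns, and the two conditions are made up for illustration; it assumes the bundled SQLite is 3.30 or newer, which is when the aggregate `FILTER` clause became available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)],
)

# Two separate aggregates, each with its own WHERE condition.
sum_small = conn.execute("SELECT SUM(b) FROM t WHERE a < 3").fetchone()[0]
count_odd = conn.execute("SELECT COUNT(*) FROM t WHERE a % 2 = 1").fetchone()[0]

# Merged form: each aggregate keeps its condition as a FILTER, and the
# shared scan keeps only rows that satisfy at least one condition.
merged = conn.execute(
    """
    SELECT SUM(b)   FILTER (WHERE a < 3),
           COUNT(*) FILTER (WHERE a % 2 = 1)
    FROM t
    WHERE a < 3 OR a % 2 = 1
    """
).fetchone()

# The single merged query returns the same values as the two queries.
assert merged == (sum_small, count_odd)
conn.close()
```

The outer `WHERE cond1 OR cond2` does not change the results, because every row it drops fails both `FILTER` conditions anyway; it only reduces the rows that the single scan has to aggregate.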
HyukjinKwon closed pull request #42282: [SPARK-44624][CONNECT] Retry
ExecutePlan in case initial request didn't reach server
URL: https://github.com/apache/spark/pull/42282
cloud-fan commented on code in PR #42315:
URL: https://github.com/apache/spark/pull/42315#discussion_r1283926955
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2272,9 +2316,7 @@ class Dataset[T] private[sql](
* @since 2.0.0
*/
def union(other:
HyukjinKwon closed pull request #42330: [SPARK-44664][PYTHON][CONNECT] Release
the execute when closing the iterator in Python client
URL: https://github.com/apache/spark/pull/42330
yaooqinn commented on PR #42295:
URL: https://github.com/apache/spark/pull/42295#issuecomment-1664906674
The staging directory is cleaned automatically by Spark, why do you even
need this hook?
HyukjinKwon commented on PR #42282:
URL: https://github.com/apache/spark/pull/42282#issuecomment-1664906517
Merged to master and branch-3.5.
HyukjinKwon commented on PR #42330:
URL: https://github.com/apache/spark/pull/42330#issuecomment-1664906302
Merged to master and branch-3.5.
wangyum commented on code in PR #42315:
URL: https://github.com/apache/spark/pull/42315#discussion_r1283918802
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2272,9 +2316,7 @@ class Dataset[T] private[sql](
* @since 2.0.0
*/
def union(other:
beliefer commented on code in PR #42223:
URL: https://github.com/apache/spark/pull/42223#discussion_r1283905000
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CombineJoinedAggregates.scala:
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software
beliefer commented on code in PR #42223:
URL: https://github.com/apache/spark/pull/42223#discussion_r1283903546
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CombineJoinedAggregates.scala:
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software
asl3 commented on code in PR #42332:
URL: https://github.com/apache/spark/pull/42332#discussion_r1283898250
##
python/pyspark/sql/tests/test_utils.py:
##
@@ -746,28 +748,123 @@ def test_assert_unequal_null_expected(self):
)
def
beliefer commented on code in PR #42223:
URL: https://github.com/apache/spark/pull/42223#discussion_r1283895467
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/EliminateJoinByCombineAggregate.scala:
##
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache
zhengruifeng commented on PR #42253:
URL: https://github.com/apache/spark/pull/42253#issuecomment-1664861475
thanks, merged to master
zhengruifeng closed pull request #42253: [SPARK-44619][INFRA] Free up disk
space for container jobs
URL: https://github.com/apache/spark/pull/42253
ulysses-you commented on code in PR #42318:
URL: https://github.com/apache/spark/pull/42318#discussion_r1283886691
##
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:
##
@@ -371,49 +373,47 @@ trait FileSourceScanLike extends DataSourceScanExec {
HyukjinKwon closed pull request #42316: [SPARK-40770][PYTHON][FOLLOW-UP][3.5]
Improved error messages for mapInPandas for schema mismatch
URL: https://github.com/apache/spark/pull/42316
zhengruifeng commented on code in PR #42332:
URL: https://github.com/apache/spark/pull/42332#discussion_r1283884122
##
python/pyspark/sql/tests/test_utils.py:
##
@@ -746,28 +748,123 @@ def test_assert_unequal_null_expected(self):
)
def
HyukjinKwon commented on PR #42316:
URL: https://github.com/apache/spark/pull/42316#issuecomment-1664851259
Merged to branch-3.5.
HyukjinKwon closed pull request #42268: [SPARK-43562][SPARK-43870][PS] Remove
APIs from `DataFrame` and `Series`
URL: https://github.com/apache/spark/pull/42268
HyukjinKwon commented on PR #42268:
URL: https://github.com/apache/spark/pull/42268#issuecomment-1664850264
Merged to master.
HyukjinKwon closed pull request #42319: [SPARK-43873][PS] Enabling
`FrameDescribeTests`
URL: https://github.com/apache/spark/pull/42319
HyukjinKwon commented on PR #42319:
URL: https://github.com/apache/spark/pull/42319#issuecomment-1664849527
Merged to master and branch-3.5.
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283881548
##
python/docs/source/getting_started/index.rst:
##
@@ -40,3 +40,4 @@ The list below is the contents of this quickstart page:
quickstart_df
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283881332
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283881012
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283880817
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283880695
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283880513
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283880415
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on code in PR #42284:
URL: https://github.com/apache/spark/pull/42284#discussion_r1283880241
##
python/docs/source/getting_started/testing_pyspark.ipynb:
##
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id":
HyukjinKwon commented on PR #42302:
URL: https://github.com/apache/spark/pull/42302#issuecomment-1664841895
It has a conflict with branch-3.5. Should we create a PR for it?
HyukjinKwon closed pull request #42302: [SPARK-44640][PYTHON] Improve error
messages for Python UDTF returning non Iterable
URL: https://github.com/apache/spark/pull/42302
HyukjinKwon commented on PR #42302:
URL: https://github.com/apache/spark/pull/42302#issuecomment-1664841189
Merged to master and branch-3.5.
cloud-fan commented on code in PR #42315:
URL: https://github.com/apache/spark/pull/42315#discussion_r1283875186
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##
@@ -157,7 +157,7 @@ abstract class Optimizer(catalogManager:
LuciferYang commented on PR #42236:
URL: https://github.com/apache/spark/pull/42236#issuecomment-1664828251
While I'm not certain if it's reasonable, I still want to point out that
relocating the content of the `spark-protobuf` module may result in a poorer
user experience: in order to use
allisonwang-db commented on code in PR #42309:
URL: https://github.com/apache/spark/pull/42309#discussion_r1283860273
##
python/pyspark/cloudpickle/cloudpickle_fast.py:
##
@@ -631,7 +631,7 @@ def dump(self, obj):
try:
return Pickler.dump(self, obj)
LuciferYang commented on PR #42236:
URL: https://github.com/apache/spark/pull/42236#issuecomment-1664798658
> Would it be easier if we change maven to use the unshaded jar?
LuciferYang opened a new pull request, #41466:
URL: https://github.com/apache/spark/pull/41466
### What changes were proposed in this pull request?
Before this PR, the Maven tests of the connect server module fail:
run
```
build/mvn clean install -DskipTests
LuciferYang commented on PR #41466:
URL: https://github.com/apache/spark/pull/41466#issuecomment-1664798514
Reopening.
asl3 opened a new pull request, #42332:
URL: https://github.com/apache/spark/pull/42332
### What changes were proposed in this pull request?
This PR adds support for pandas DataFrame in `assertDataFrameEqual`, while
delaying all pandas imports until pandas environment dependency is
LuciferYang commented on code in PR #42236:
URL: https://github.com/apache/spark/pull/42236#discussion_r1283844577
##
connector/connect/server/src/test/resources/test.proto:
##
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ *
github-actions[bot] closed pull request #38171: [SPARK-9213] [SQL] Improve
regular expression performance (via joni)
URL: https://github.com/apache/spark/pull/38171
github-actions[bot] commented on PR #40629:
URL: https://github.com/apache/spark/pull/40629#issuecomment-1664792440
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #40665:
URL: https://github.com/apache/spark/pull/40665#issuecomment-1664792424
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #40918:
URL: https://github.com/apache/spark/pull/40918#issuecomment-1664792404
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #40929:
URL: https://github.com/apache/spark/pull/40929#issuecomment-1664792377
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #40930: [DO NOT MERGE] File constant
metadata extractors split
URL: https://github.com/apache/spark/pull/40930
HyukjinKwon commented on PR #42283:
URL: https://github.com/apache/spark/pull/42283#issuecomment-1664786685
@WweiL it has a conflict with branch-3.5. Mind resolving it and creating a
PR, please?
juliuszsompolski commented on PR #42320:
URL: https://github.com/apache/spark/pull/42320#issuecomment-1664786638
Thank you @cdkrot. I continued working on it and incorporated it in
https://github.com/apache/spark/pull/42331. That should supersede this.
HyukjinKwon closed pull request #42283:
[SPARK-44433][PYTHON][CONNECT][SS][FOLLOWUP] Terminate listener process with
`removeListener` and improvements
URL: https://github.com/apache/spark/pull/42283
HyukjinKwon commented on PR #42283:
URL: https://github.com/apache/spark/pull/42283#issuecomment-1664785828
Merged to master and branch-3.5.
juliuszsompolski opened a new pull request, #42331:
URL: https://github.com/apache/spark/pull/42331
### What changes were proposed in this pull request?
This makes sure that all iterators used in Spark Connect Scala client are
`CloseableIterator`.
1. Makes
HyukjinKwon commented on PR #42330:
URL: https://github.com/apache/spark/pull/42330#issuecomment-1664782620
cc @juliuszsompolski, @cdkrot, @zhengruifeng and @ueshin FYI
HyukjinKwon opened a new pull request, #42330:
URL: https://github.com/apache/spark/pull/42330
### What changes were proposed in this pull request?
This PR implements the symmetry of
https://github.com/apache/spark/pull/42304 and
https://github.com/apache/spark/pull/42320.
srowen commented on PR #42322:
URL: https://github.com/apache/spark/pull/42322#issuecomment-1664773317
Merged to master/3.5
srowen closed pull request #42322: [MINOR][DOC] Fix a typo in
ResolveReferencesInUpdate scaladoc
URL: https://github.com/apache/spark/pull/42322
sdruzkin commented on PR #42322:
URL: https://github.com/apache/spark/pull/42322#issuecomment-1664771686
Tests are green.
HyukjinKwon commented on PR #42320:
URL: https://github.com/apache/spark/pull/42320#issuecomment-1664769082
Merged https://github.com/apache/spark/pull/42304
HyukjinKwon closed pull request #42304: [SPARK-44642][CONNECT] ReleaseExecute
in ExecutePlanResponseReattachableIterator after it gets error from server
URL: https://github.com/apache/spark/pull/42304
HyukjinKwon commented on PR #42304:
URL: https://github.com/apache/spark/pull/42304#issuecomment-1664768521
Merged to master and branch-3.5.
HyukjinKwon commented on PR #42314:
URL: https://github.com/apache/spark/pull/42314#issuecomment-1664766927
Merged to master and branch-3.5.
HyukjinKwon closed pull request #42314: [SPARK-44652] Raise error when only one
df is None
URL: https://github.com/apache/spark/pull/42314
ueshin commented on code in PR #42283:
URL: https://github.com/apache/spark/pull/42283#discussion_r1283804782
##
core/src/main/scala/org/apache/spark/api/python/StreamingPythonRunner.scala:
##
@@ -60,9 +69,9 @@ private[spark] class StreamingPythonRunner(func:
PythonFunction,
dtenedor commented on code in PR #42272:
URL: https://github.com/apache/spark/pull/42272#discussion_r1283804978
##
python/docs/source/user_guide/sql/python_udtf.rst:
##
@@ -0,0 +1,140 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor
allisonwang-db commented on code in PR #42272:
URL: https://github.com/apache/spark/pull/42272#discussion_r1283803283
##
python/docs/source/user_guide/sql/python_udtf.rst:
##
@@ -0,0 +1,140 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+or more
allisonwang-db commented on code in PR #42272:
URL: https://github.com/apache/spark/pull/42272#discussion_r1283802304
##
python/docs/source/user_guide/sql/python_udtf.rst:
##
@@ -0,0 +1,140 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+or more
ueshin commented on code in PR #42302:
URL: https://github.com/apache/spark/pull/42302#discussion_r1283569195
##
python/pyspark/worker.py:
##
@@ -599,7 +600,7 @@ def verify_result(result):
raise PySparkTypeError(
allisonwang-db commented on code in PR #42272:
URL: https://github.com/apache/spark/pull/42272#discussion_r1283791586
##
examples/src/main/python/sql/udtf.py:
##
@@ -0,0 +1,169 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
allisonwang-db commented on code in PR #42272:
URL: https://github.com/apache/spark/pull/42272#discussion_r1283790876
##
examples/src/main/python/sql/udtf.py:
##
@@ -0,0 +1,169 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
ueshin commented on PR #42328:
URL: https://github.com/apache/spark/pull/42328#issuecomment-1664724038
cc @allisonwang-db @HyukjinKwon
allisonwang-db opened a new pull request, #42329:
URL: https://github.com/apache/spark/pull/42329
### What changes were proposed in this pull request?
This PR disables arrow optimization by default for Python UDTFs.
### Why are the changes needed?
To make Python