chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2138544456
Yes, but these imports are wrapped in `if not is_remote_only()`, so the Spark
Connect test should already skip these import statements. This is weird; I will
take another look.
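A minimal sketch of the guard pattern described above, not the actual contents of `pyspark/__init__.py`. The helper name `import_unless_remote_only` and its `remote_only` flag are hypothetical; the flag stands in for whatever `is_remote_only()` returns.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_unless_remote_only(module_name: str, remote_only: bool) -> Optional[ModuleType]:
    """Import `module_name` only when the install is not remote-only.

    `remote_only` is a hypothetical stand-in for the result of is_remote_only().
    """
    if remote_only:
        # Remote-only (Spark Connect) installs skip the import entirely,
        # so a missing dependency is never touched.
        return None
    return importlib.import_module(module_name)
```

If the guard behaves as intended, a remote-only run never reaches the import, so an import error in that mode suggests the failing import lives outside the guarded block.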
--
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2138460399
```
from pyspark.core.rdd import RDD, RDDBarrier
from pyspark.core.files import SparkFiles
from pyspark.core.status import StatusTracker, SparkJobInfo,
```
chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2138108922
This import statement is supposed to be skipped for the Spark Connect test:
https://github.com/apache/spark/blob/master/python/pyspark/__init__.py#L55
Is the is_remote_only()
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2123879472
@chaoqin-li1123 gentle ping on this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2099809962
You should follow the steps at
https://github.com/apache/spark/blob/master/.github/workflows/build_python_connect.yml#L69-L94
and try to reproduce why it fails. Spark Connect doesn't need to
chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2099774647
This seems to be broken in the main function of pyspark init(). What is the
expected action item we should take? @HyukjinKwon
--
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2097190957
Follow
https://github.com/apache/spark/blob/master/.github/workflows/build_python_connect.yml#L80-L113
to reproduce the failure.
--
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2097139796
py4j shouldn't be referenced in the Connect test. Can we move these imports and
import them only where they're actually used?
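A rough sketch of the deferred-import idea suggested above, not the change made in the PR. The function names are hypothetical; the default module name `py4j.java_gateway` is only resolved if the caller actually takes the JVM-backed path.

```python
import importlib
from types import ModuleType


def load_gateway_module(module_name: str = "py4j.java_gateway") -> ModuleType:
    """Import the gateway module at call time instead of at module import time.

    Hypothetical helper: only code paths that call this function need py4j.
    """
    return importlib.import_module(module_name)


def connect_only_path() -> str:
    """Hypothetical code path that never calls load_gateway_module()."""
    return "ok without py4j"
```

Because nothing is imported at the top of the module, a pure Python (Connect-only) environment can import this file and run `connect_only_path()` even when py4j is absent.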
--
chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095178573
@HyukjinKwon Both test_python_datasource and test_python_streaming_datasource
will fail with the same error if py4j*.zip is removed:
> Traceback (most recent call last):
>
dongjoon-hyun commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095027198
Thank you for checking and mitigating this by reverting, @HyukjinKwon.
--
HyukjinKwon commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2095022472
@chaoqin-li1123 It seems this test does not work with the pure Python library.
Can you see if the tests pass after removing `python/lib/py4j*.zip`?
Let me revert this for now
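As a quick in-process approximation of the check requested above, a module can be made unimportable by placing `None` in `sys.modules`. This is only a sketch with hypothetical helper names; it does not replace actually removing `python/lib/py4j*.zip` and running the CI steps.

```python
import contextlib
import importlib
import sys


@contextlib.contextmanager
def block_module(name: str):
    """Temporarily make `import name` fail, as if the package were absent."""
    prefix = name + "."
    saved = {k: v for k, v in sys.modules.items() if k == name or k.startswith(prefix)}
    for key in saved:
        del sys.modules[key]
    # A None entry in sys.modules makes the import machinery raise ImportError.
    sys.modules[name] = None
    try:
        yield
    finally:
        del sys.modules[name]
        sys.modules.update(saved)  # restore anything that was loaded before


def can_import(name: str) -> bool:
    """Return True if `name` can be imported right now."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True
```

Wrapping a test's imports in `with block_module("py4j"):` would surface any unguarded py4j reference as an `ImportError` in the current process.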
dongjoon-hyun closed pull request #45950: [SPARK-4][PYTHON][SS][TESTS] Add
spark connect test for python streaming data source
URL: https://github.com/apache/spark/pull/45950
--
chaoqin-li1123 commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2046225155
cc @allisonwang-db @HeartSaVioR
--
xinrong-meng commented on PR #45950:
URL: https://github.com/apache/spark/pull/45950#issuecomment-2045843912
LGTM once CI passes, thanks!
--
chaoqin-li1123 commented on code in PR #45950:
URL: https://github.com/apache/spark/pull/45950#discussion_r1558088128
##
python/pyspark/sql/tests/connect/test_parity_python_streaming_datasource.py:
##
@@ -0,0 +1,35 @@
+#
Review Comment:
Added.
--