juliuszsompolski commented on PR #42908: URL: https://github.com/apache/spark/pull/42908#issuecomment-1721459768
@dongjoon-hyun I don't think the SparkConnectSessionHolderSuite failures are related, and I don't know what's going on there.

```
Streaming foreachBatch worker is starting with url sc://localhost:15002/;user_id=testUser and sessionId 9863bb98-6682-43ad-bc86-b32d8486fb47.
Traceback (most recent call last):
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/pandas/utils.py", line 27, in require_minimum_pandas_version
    import pandas
ModuleNotFoundError: No module named 'pandas'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py", line 86, in <module>
    main(sock_file, sock_file)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py", line 51, in main
    spark_connect_session = SparkSession.builder.remote(connect_url).getOrCreate()
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/session.py", line 464, in getOrCreate
    from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/session.py", line 19, in <module>
    check_dependencies(__name__)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/utils.py", line 33, in check_dependencies
    require_minimum_pandas_version()
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/pandas/utils.py", line 34, in require_minimum_pandas_version
    raise ImportError(
ImportError: Pandas >= 1.0.5 must be installed; however, it was not found.
[info] - python foreachBatch process: process terminates after query is stopped *** FAILED *** (1 second, 115 milliseconds)
Streaming query listener worker is starting with url sc://localhost:15002/;user_id=testUser and sessionId ab6cfcde-a9f1-4b96-8ca3-7aab5c6ff438.
Traceback (most recent call last):
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/pandas/utils.py", line 27, in require_minimum_pandas_version
    import pandas
ModuleNotFoundError: No module named 'pandas'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/streaming/worker/listener_worker.py", line 99, in <module>
    main(sock_file, sock_file)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/streaming/worker/listener_worker.py", line 59, in main
    spark_connect_session = SparkSession.builder.remote(connect_url).getOrCreate()
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/session.py", line 464, in getOrCreate
    from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/session.py", line 19, in <module>
    check_dependencies(__name__)
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/connect/utils.py", line 33, in check_dependencies
    require_minimum_pandas_version()
  File "/home/runner/work/apache-spark/apache-spark/python/pyspark/sql/pandas/utils.py", line 34, in require_minimum_pandas_version
    raise ImportError(
ImportError: Pandas >= 1.0.5 must be installed; however, it was not found.
[info] - python listener process: process terminates after listener is removed *** FAILED *** (434 milliseconds)
[info]   java.io.EOFException:
```

It looks to me like some (intermittent?) environment issue.
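For anyone trying to confirm the environment theory, here is a minimal sketch of a standalone check that could be run with the same Python interpreter the test workers use. It only assumes the `Pandas >= 1.0.5` requirement quoted in the `ImportError` above; the helper names `version_tuple` and `pandas_status` are hypothetical and not part of PySpark.

```python
import importlib.util
import re
from importlib import metadata  # available in the standard library since Python 3.8

# Minimum version taken from the ImportError message in the CI log.
MIN_PANDAS = (1, 0, 5)


def version_tuple(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def pandas_status(minimum=MIN_PANDAS):
    """Report whether pandas is 'missing', 'too old', or 'ok' for this interpreter."""
    if importlib.util.find_spec("pandas") is None:
        return "missing"
    try:
        installed = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if version_tuple(installed) >= minimum else "too old"


if __name__ == "__main__":
    print("pandas:", pandas_status())
```

If this reports `missing` on the runner, the Python workers would fail at import time in the way the log shows, independent of the changes in this PR.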