[jira] [Updated] (SPARK-48090) Streaming exception catch failure in 3.5 client <> 4.0 server

2024-05-06 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-48090:
---------------------------------
Description: 
{code}
======================================================================
FAIL [1.975s]: test_stream_exception 
(pyspark.sql.tests.connect.streaming.test_parity_streaming.StreamingParityTests.test_stream_exception)
----------------------------------------------------------------------
Traceback (most recent call last):
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/streaming/test_streaming.py",
 line 287, in test_stream_exception
sq.processAllAvailable()
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/streaming/query.py",
 line 129, in processAllAvailable
self._execute_streaming_query_cmd(cmd)
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/streaming/query.py",
 line 177, in _execute_streaming_query_cmd
(_, properties) = self._session.client.execute_command(exec_cmd)
  ^^
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/client/core.py", 
line 982, in execute_command
data, _, _, _, properties = self._execute_and_fetch(req)

  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/client/core.py", 
line 1283, in _execute_and_fetch
for response in self._execute_and_fetch_as_iterator(req):
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/client/core.py", 
line 1264, in _execute_and_fetch_as_iterator
self._handle_error(error)
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/client/core.py", 
line 1503, in _handle_error
self._handle_rpc_error(error)
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/connect/client/core.py", 
line 1539, in _handle_rpc_error
raise convert_exception(info, status.message) from None
pyspark.errors.exceptions.connect.StreamingQueryException: [STREAM_FAILED] 
Query [id = 38d0d145-1f57-4b92-b317-d9de727d9468, runId = 
2b963119-d391-4c62-abea-970274859b80] terminated with exception: Job aborted 
due to stage failure: Task 0 in stage 79.0 failed 1 times, most recent failure: 
Lost task 0.0 in stage 79.0 (TID 116) 
(fv-az1144-341.tm43j05r3bqe3lauap1nzddazg.ex.internal.cloudapp.net executor 
driver): org.apache.spark.api.python.PythonException: Traceback (most recent 
call last):
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 
1834, in main
process()
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 
1826, in process
serializer.dump_stream(out_iter, outfile)
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", 
line 224, in dump_stream
self.serializer.dump_stream(self._batched(iterator), stream)
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", 
line 145, in dump_stream
for obj in iterator:
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", 
line 213, in _batched
for item in iterator:
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 
1734, in mapper
result = tuple(f(*[a[o] for o in arg_offsets]) for arg_offsets, f in udfs)
 ^
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 
1734, in <genexpr>
result = tuple(f(*[a[o] for o in arg_offsets]) for arg_offsets, f in udfs)
   ^^^
  File 
"/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 
112, in <lambda>
return args_kwargs_offsets, lambda *a: func(*a)
   
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/util.py", 
line 118, in wrapper
return f(*args, **kwargs)
   ^^
  File "/home/runner/work/spark/spark-3
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/streaming/test_streaming.py",
 line 291, in test_stream_exception
self._assert_exception_tree_contains_msg(e, "ZeroDivisionError")
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/streaming/test_streaming.py",
 line 300, in _assert_exception_tree_contains_msg
self._assert_exception_tree_contains_msg_connect(exception, msg)
  File 
"/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/streaming/test_streaming.py",
 line 305, in _assert_exception_tree_contains_msg_connect
self.assertTrue(
AssertionError: False is not true : Exception tree doesn't contain the expected message: ZeroDivisionError
{code}
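
The failing assertion walks the caught exception and its chained causes looking for the worker-side error text. A minimal sketch of that check, with an illustrative helper name (the actual {{_assert_exception_tree_contains_msg_connect}} in test_streaming.py may differ in detail):

{code}
# Illustrative sketch only: `exception_tree_contains` is a hypothetical
# helper mirroring what _assert_exception_tree_contains_msg_connect
# asserts; the real test code may differ.
def exception_tree_contains(exc: BaseException, msg: str) -> bool:
    """Return True if `msg` appears in `exc` or any chained cause."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        if msg in str(exc):
            return True
        # Follow explicit (`raise ... from ...`) and implicit chaining.
        exc = exc.__cause__ or exc.__context__
    return False

# The failing check amounts to:
#     assert exception_tree_contains(e, "ZeroDivisionError")
# which returns False when the 4.0 server's STREAM_FAILED message no
# longer embeds the Python worker traceback the 3.5 client expects.
{code}

Note that {{_handle_rpc_error}} raises with {{from None}} (see the traceback above), so the reconstructed client-side exception carries no Python-level cause chain; whatever the 4.0 server puts in {{status.message}} is all the 3.5 client can match against.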

[jira] [Updated] (SPARK-48090) Streaming exception catch failure in 3.5 client <> 4.0 server

2024-05-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-48090:
-----------------------------------
Labels: pull-request-available  (was: )

> Streaming exception catch failure in 3.5 client <> 4.0 server
> -------------------------------------------------------------
>
> Key: SPARK-48090
> URL: https://issues.apache.org/jira/browse/SPARK-48090
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>