[ https://issues.apache.org/jira/browse/SPARK-48087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-48087.
----------------------------------
    Fix Version/s: 3.5.2
       Resolution: Fixed

Issue resolved by pull request 46473
[https://github.com/apache/spark/pull/46473]

> Python UDTF incompatibility in 3.5 client <> 4.0 server
> -------------------------------------------------------
>
>                 Key: SPARK-48087
>                 URL: https://issues.apache.org/jira/browse/SPARK-48087
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.2
>
> {code}
> ======================================================================
> FAIL [0.103s]: test_udtf_init_with_additional_args
> (pyspark.sql.tests.connect.test_parity_udtf.ArrowUDTFParityTests.test_udtf_init_with_additional_args)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>     self._check_result_or_exception(TestUDTF, ret_type, expected)
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 598, in _check_result_or_exception
>     with self.assertRaisesRegex(err_type, expected):
> AssertionError: "AttributeError" does not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1834, in main
>     process()
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1826, in process
>     serializer.dump_stream(out_iter, outfile)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 224, in dump_stream
>     self.serializer.dump_stream(self._batched(iterator), stream)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 145, in dump_stream
>     for obj in iterator:
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 213, in _batched
>     for item in iterator:
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1391, in mapper
>     yield eval(*[a[o] for o in args_kwargs_offsets])
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1371, in evaluate
>     return tuple(map(verify_and_convert_result, res))
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1340, in verify_and_convert_result
>     return toInternal(result)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1291, in toInternal
>     return tuple(
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1292, in <genexpr>
>     f.toInternal(v) if c else v
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 907, in toInternal
>     return self.dataType.toInternal(obj)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 372, in toInternal
>     calendar.timegm(dt.utctimetuple()) if dt.tzinfo else
>     time.mktime(dt.timetuple())
> ..."
> {code}
> {code}
> ======================================================================
> FAIL [0.096s]: test_udtf_init_with_additional_args
> (pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_init_with_additional_args)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the
> user-defined table function 'TestUDTF' because its constructor is invalid:
> the function does not implement the 'analyze' method, and its constructor has
> more than one argument (including the 'self' reference). Please update the
> table function so that its constructor accepts exactly one 'self' argument,
> and try the query again.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 274, in test_udtf_init_with_additional_args
>     with self.assertRaisesRegex(
> AssertionError: "__init__\(\) missing 1 required positional argument: 'a'"
> does not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the
> user-defined table function 'TestUDTF' because its constructor is invalid:
> the function does not implement the 'analyze' method, and its constructor has
> more than one argument (including the 'self' reference). Please update the
> table function so that its constructor accepts exactly one 'self' argument,
> and try the query again.
> "
> {code}
> {code}
> ======================================================================
> FAIL [0.087s]: test_udtf_with_wrong_num_input
> (pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_with_wrong_num_input)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the
> user-defined table function 'TestUDTF' because the function arguments did not
> match the expected signature of the 'eval' method (missing a required
> argument: 'a'). Please update the query so that this table function call
> provides arguments matching the expected signature, or else update the table
> function so that its 'eval' method accepts the provided arguments, and then
> try the query again.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 255, in test_udtf_with_wrong_num_input
>     with self.assertRaisesRegex(
> AssertionError: "eval\(\) missing 1 required positional argument: 'a'" does
> not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the
> user-defined table function 'TestUDTF' because the function arguments did not
> match the expected signature of the 'eval' method (missing a required
> argument: 'a'). Please update the query so that this table function call
> provides arguments matching the expected signature, or else update the table
> function so that its 'eval' method accepts the provided arguments, and then
> try the query again.
> "
> ----------------------------------------------------------------------
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
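The two error codes quoted in the failures above describe signature rules the worker enforces on a UDTF class: without an `analyze` method, the constructor must accept only `self`, and the call site's arguments must match the `eval` signature. The following is a rough, stdlib-only sketch of those rules for illustration; the class names and helper functions are hypothetical, not Spark's actual worker code or the classes in test_udtf.py.

```python
import inspect

class UDTFWithoutAnalyze:
    # No 'analyze' method, yet __init__ takes an extra argument 'a':
    # the shape UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD rejects.
    def __init__(self, a):
        self.a = a

    def eval(self, x):
        yield (x + self.a,)


class UDTFWithAnalyze:
    # With an 'analyze' method, a multi-argument constructor is permitted.
    @staticmethod
    def analyze(*args):
        return None  # placeholder; a real UDTF returns an AnalyzeResult

    def __init__(self, analyze_result=None):
        self.analyze_result = analyze_result

    def eval(self, x):
        yield (x,)


def constructor_is_valid(udtf_cls) -> bool:
    # Without 'analyze', the constructor must accept exactly one
    # argument: the 'self' reference.
    if hasattr(udtf_cls, "analyze"):
        return True
    params = inspect.signature(udtf_cls.__init__).parameters.values()
    required = [
        p for p in params
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    return len(required) <= 1  # only 'self'


def eval_args_match(udtf_cls, *args, **kwargs) -> bool:
    # The provided arguments must bind to the 'eval' signature; the rule
    # behind UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE.
    try:
        # None stands in for 'self'
        inspect.signature(udtf_cls.eval).bind(None, *args, **kwargs)
        return True
    except TypeError:
        return False
```

Under this sketch, `constructor_is_valid(UDTFWithoutAnalyze)` is false, which corresponds to the 4.0 server rejecting the 3.5 test's UDTF at `read_udtf` time instead of raising the `AttributeError`/`TypeError` the 3.5 tests expect; hence the assertion mismatches above.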