[ https://issues.apache.org/jira/browse/SPARK-48087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-48087.
----------------------------------
    Fix Version/s: 3.5.2
       Resolution: Fixed

Issue resolved by pull request 46473
[https://github.com/apache/spark/pull/46473]

> Python UDTF incompatibility in 3.5 client <> 4.0 server
> -------------------------------------------------------
>
>                 Key: SPARK-48087
>                 URL: https://issues.apache.org/jira/browse/SPARK-48087
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.2
>
> {code}
> ======================================================================
> FAIL [0.103s]: test_udtf_init_with_additional_args
> (pyspark.sql.tests.connect.test_parity_udtf.ArrowUDTFParityTests.test_udtf_init_with_additional_args)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>     self._check_result_or_exception(TestUDTF, ret_type, expected)
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 598, in _check_result_or_exception
>     with self.assertRaisesRegex(err_type, expected):
> AssertionError: "AttributeError" does not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1834, in main
>     process()
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1826, in process
>     serializer.dump_stream(out_iter, outfile)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 224, in dump_stream
>     self.serializer.dump_stream(self._batched(iterator), stream)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 145, in dump_stream
>     for obj in iterator:
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 213, in _batched
>     for item in iterator:
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1391, in mapper
>     yield eval(*[a[o] for o in args_kwargs_offsets])
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1371, in evaluate
>     return tuple(map(verify_and_convert_result, res))
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1340, in verify_and_convert_result
>     return toInternal(result)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1291, in toInternal
>     return tuple(
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1292, in <genexpr>
>     f.toInternal(v) if c else v
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 907, in toInternal
>     return self.dataType.toInternal(obj)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 372, in toInternal
>     calendar.timegm(dt.utctimetuple()) if dt.tzinfo else
>     time.mktime(dt.timetuple())
> ..."
> {code}
> {code}
> ======================================================================
> FAIL [0.096s]: test_udtf_init_with_additional_args
> (pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_init_with_additional_args)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the
> user-defined table function 'TestUDTF' because its constructor is invalid:
> the function does not implement the 'analyze' method, and its constructor has
> more than one argument (including the 'self' reference). Please update the
> table function so that its constructor accepts exactly one 'self' argument,
> and try the query again.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 274, in test_udtf_init_with_additional_args
>     with self.assertRaisesRegex(
> AssertionError: "__init__\(\) missing 1 required positional argument: 'a'"
> does not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the
> user-defined table function 'TestUDTF' because its constructor is invalid:
> the function does not implement the 'analyze' method, and its constructor has
> more than one argument (including the 'self' reference). Please update the
> table function so that its constructor accepts exactly one 'self' argument,
> and try the query again.
> "
> {code}
> {code}
> ======================================================================
> FAIL [0.087s]: test_udtf_with_wrong_num_input
> (pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_with_wrong_num_input)
> ----------------------------------------------------------------------
> pyspark.errors.exceptions.connect.PythonException:
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the
> user-defined table function 'TestUDTF' because the function arguments did not
> match the expected signature of the 'eval' method (missing a required
> argument: 'a'). Please update the query so that this table function call
> provides arguments matching the expected signature, or else update the table
> function so that its 'eval' method accepts the provided arguments, and then
> try the query again.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 255, in test_udtf_with_wrong_num_input
>     with self.assertRaisesRegex(
> AssertionError: "eval\(\) missing 1 required positional argument: 'a'" does
> not match "
> An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
>     func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
>   File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
>     raise PySparkRuntimeError(
> pyspark.errors.exceptions.base.PySparkRuntimeError:
> [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the
> user-defined table function 'TestUDTF' because the function arguments did not
> match the expected signature of the 'eval' method (missing a required
> argument: 'a'). Please update the query so that this table function call
> provides arguments matching the expected signature, or else update the table
> function so that its 'eval' method accepts the provided arguments, and then
> try the query again.
> "
> ----------------------------------------------------------------------
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
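The two error codes quoted in the failures above describe signature rules the worker enforces on a UDTF class: without an `analyze` method, the constructor must accept only `self`, and the call site's arguments must match the `eval` signature. The following is a rough, stdlib-only sketch of those rules for illustration; the class names and helper functions are hypothetical, not Spark's actual worker code or the classes in test_udtf.py.

```python
import inspect

class UDTFWithoutAnalyze:
    # No 'analyze' method, yet __init__ takes an extra argument 'a':
    # the shape UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD rejects.
    def __init__(self, a):
        self.a = a

    def eval(self, x):
        yield (x + self.a,)


class UDTFWithAnalyze:
    # With an 'analyze' method, a multi-argument constructor is permitted.
    @staticmethod
    def analyze(*args):
        return None  # placeholder; a real UDTF returns an AnalyzeResult

    def __init__(self, analyze_result=None):
        self.analyze_result = analyze_result

    def eval(self, x):
        yield (x,)


def constructor_is_valid(udtf_cls) -> bool:
    # Without 'analyze', the constructor must accept exactly one
    # argument: the 'self' reference.
    if hasattr(udtf_cls, "analyze"):
        return True
    params = inspect.signature(udtf_cls.__init__).parameters.values()
    required = [
        p for p in params
        if p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    return len(required) <= 1  # only 'self'


def eval_args_match(udtf_cls, *args, **kwargs) -> bool:
    # The provided arguments must bind to the 'eval' signature; the rule
    # behind UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE.
    try:
        # None stands in for 'self'
        inspect.signature(udtf_cls.eval).bind(None, *args, **kwargs)
        return True
    except TypeError:
        return False
```

Under this sketch, `constructor_is_valid(UDTFWithoutAnalyze)` is false, which corresponds to the 4.0 server rejecting the 3.5 test's UDTF at `read_udtf` time instead of raising the `AttributeError`/`TypeError` the 3.5 tests expect; hence the assertion mismatches above.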