Yicong-Huang commented on code in PR #55532:
URL: https://github.com/apache/spark/pull/55532#discussion_r3157516808
##########
python/pyspark/worker.py:
##########
@@ -234,46 +248,56 @@ def chain(f, g):
     return lambda *a: g(f(*a))
-def verify_result(expected_type: type) -> Callable[[Any], Iterator]:
-    """
-    Create a result verifier that checks both iterability and element types.
+@overload
+def verify_return_type(result: Any, expected_type: Type[T]) -> T: ...
-    Returns a function that takes a UDF result, verifies it is iterable,
-    and lazily type-checks each element via map.
-    Parameters
-    ----------
-    expected_type : type
-        The expected Python/PyArrow type for each element
-        (e.g. pa.RecordBatch, pa.Array).
+@overload
+def verify_return_type(result: Any, expected_type: Any) -> Any: ...
+
+
+def verify_return_type(result: Any, expected_type: Any) -> Any:
     """
+    Verify a UDF return value against an expected type.
-    package = getattr(inspect.getmodule(expected_type), "__package__", "")
-    label: str = f"{package}.{expected_type.__name__}"
+    Returns ``result`` unchanged if ``isinstance(result, expected_type)``.
+    For ``Iterator[T]``, returns a lazy iterator that checks each element
+    against ``T`` on consumption. Raises ``PySparkTypeError`` on mismatch.
+    """
+    if getattr(expected_type, "_name", None) == "Iterator":
Review Comment:
Switched to `get_origin(expected_type) is collections.abc.Iterator` in
6f2967b — matches both `typing.Iterator[T]` and the PEP 585 form.
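
A minimal sketch of the check described above, assuming Python 3.9+ so that `collections.abc.Iterator[int]` is subscriptable. The helper name `is_iterator_hint` is hypothetical and only illustrates the `get_origin` comparison; it is not the code in the PR.

```python
import collections.abc
import typing
from typing import Any, get_origin


def is_iterator_hint(expected_type: Any) -> bool:
    """Hypothetical helper: True when the hint's origin is collections.abc.Iterator."""
    return get_origin(expected_type) is collections.abc.Iterator


# Both spellings resolve to the same origin class.
print(is_iterator_hint(typing.Iterator[int]))
print(is_iterator_hint(collections.abc.Iterator[int]))

# Plain classes and other generics have a different (or no) origin.
print(is_iterator_hint(int))
print(is_iterator_hint(typing.List[int]))
```

Comparing the origin object avoids relying on the private `_name` attribute of `typing` aliases.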
##########
python/pyspark/worker.py:
##########
@@ -234,46 +251,56 @@ def chain(f, g):
     return lambda *a: g(f(*a))
-def verify_result(expected_type: type) -> Callable[[Any], Iterator]:
-    """
-    Create a result verifier that checks both iterability and element types.
+@overload
Review Comment:
Tried — mypy rejects `Type[Iterator[T]]` with `type-abstract` since Iterator
is an ABC. Same root cause as the comment under #259; the `Any` overload is the
workaround.
##########
python/pyspark/worker.py:
##########
@@ -234,46 +251,56 @@ def chain(f, g):
     return lambda *a: g(f(*a))
-def verify_result(expected_type: type) -> Callable[[Any], Iterator]:
-    """
-    Create a result verifier that checks both iterability and element types.
+@overload
+def verify_return_type(result: Any, expected_type: Type[T]) -> T: ...
-    Returns a function that takes a UDF result, verifies it is iterable,
-    and lazily type-checks each element via map.
-    Parameters
-    ----------
-    expected_type : type
-        The expected Python/PyArrow type for each element
-        (e.g. pa.RecordBatch, pa.Array).
+@overload
+def verify_return_type(result: Any, expected_type: Any) -> Any: ...
Review Comment:
Passing `Iterator[X]` against a `Type[T]` parameter trips mypy's
`type-abstract` rule (Iterator is an ABC). The `Any` overload is the bypass for
the iterator callers. Happy to drop it once we find a mypy-accepted form.
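
For readers following along, here is a rough runtime sketch of the two-overload shape under discussion. It is an illustration under assumptions, not the PR's implementation: the built-in `TypeError` stands in for `PySparkTypeError`, `_check` is a hypothetical helper, and the error messages are invented.

```python
import collections.abc
from typing import Any, Iterator, Type, TypeVar, get_args, get_origin, overload

T = TypeVar("T")


def _check(value: Any, expected_type: Any) -> Any:
    # Hypothetical helper; the real code raises PySparkTypeError, not TypeError.
    if not isinstance(value, expected_type):
        raise TypeError(
            f"expected {expected_type.__name__}, got {type(value).__name__}"
        )
    return value


@overload
def verify_return_type(result: Any, expected_type: Type[T]) -> T: ...


@overload
def verify_return_type(result: Any, expected_type: Any) -> Any: ...


def verify_return_type(result: Any, expected_type: Any) -> Any:
    if get_origin(expected_type) is collections.abc.Iterator:
        args = get_args(expected_type)
        # A bare Iterator hint carries no element type; accept any element then.
        elem_type = args[0] if args else object
        if not isinstance(result, collections.abc.Iterator):
            raise TypeError(f"expected an iterator, got {type(result).__name__}")
        # map() is lazy, so each element is checked only when it is consumed.
        return map(lambda item: _check(item, elem_type), result)
    return _check(result, expected_type)


checked = verify_return_type(iter([1, "x"]), Iterator[int])
first = next(checked)  # 1 passes the int check
try:
    next(checked)  # "x" fails only now, on consumption
except TypeError as exc:
    print(f"lazy check failed: {exc}")
```

Concrete classes go through the `Type[T]` overload and keep their static type; the iterator hint falls through to the `Any` overload, which is the trade-off described in the comment above.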
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]