sven-weber-db commented on code in PR #55716:
URL: https://github.com/apache/spark/pull/55716#discussion_r3228222363
##########
python/pyspark/worker.py:
##########
@@ -3602,6 +3615,10 @@ def main(infile, outfile):
runner_conf = RunnerConf(init_info.runner_conf)
eval_conf = EvalConf(init_info.eval_conf)
if eval_type == PythonEvalType.NON_UDF:
+ # The type checker needs some help here..
+ # See the code in WorkerInitInfo.from_stream(infile)
+ # to see the correct type.
+ assert isinstance(init_info.udf_info, memoryview)
Review Comment:
`typing.cast(memoryview, init_info.udf_info)` would satisfy `mypy`, but it
performs no run-time check, so `init_info.udf_info` would still not be
guaranteed to be a `memoryview`. If someone changed
`WorkerInitInfo.from_stream` to return a different type for the `NON_UDF` case,
the mismatch would go unnoticed and could lead to subtle run-time errors. I
added the assertion to prevent exactly that.
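To illustrate the difference, here is a minimal standalone sketch (not code from the PR; the variable names are made up for the example). It shows that `typing.cast` returns its argument unchanged at run time, while an `isinstance` assertion actually detects the wrong type:

```python
from typing import cast

# Hypothetical stand-in for a udf_info value of the wrong type:
# bytes instead of memoryview.
value: object = b"raw bytes"

# cast() only informs the type checker. At run time it returns the
# argument unchanged and checks nothing.
as_view = cast(memoryview, value)
cast_changed_nothing = as_view is value and not isinstance(as_view, memoryview)

# An isinstance-based assert inspects the object and fails on a mismatch
# (unless Python runs with -O, which strips assert statements).
try:
    assert isinstance(value, memoryview)
    assert_caught_mismatch = False
except AssertionError:
    assert_caught_mismatch = True
```

With the cast, the mismatch would only surface later, when a `memoryview`-specific attribute is first used.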
However, I agree that an `assert` is probably not the right solution here. I
revisited the problem and think a `TypeGuard` is a cleaner way to handle it.
I added the following `TypeGuard` to `worker_message.py`:
```python
def is_non_udf_info(
    udf_info: UdfInfoType,
    eval_type: int,
) -> TypeGuard[memoryview]:
    """TypeGuard that narrows udf_info to memoryview when eval_type is
    NON_UDF."""
    return eval_type == PythonEvalType.NON_UDF
```
It can replace the `if eval_type == PythonEvalType.NON_UDF:` check in
`worker.py`, which resolves the typing error. It also keeps the typing logic
close to where `udf_info` is defined. Let me know if you prefer a different
solution!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]