sven-weber-db commented on code in PR #55716:
URL: https://github.com/apache/spark/pull/55716#discussion_r3281993975
##########
python/pyspark/worker_message.py:
##########
@@ -46,7 +47,7 @@ class ResourceInfo:
local_properties: dict[str, str]
@classmethod
- def from_stream(cls, stream: IO) -> "TaskContextInfo":
+ def from_stream(cls, stream: ZeroCopyByteStream) -> "TaskContextInfo":
Review Comment:
I think the `memoryview` and `bytes` problem is very manageable, as we can
support both without much trouble. From my current understanding, if an API
supports `memoryview`, it will also support `bytes`.
The difference between the `ZeroCopyByteStream` and `IO` is a bit more
involved, I would say. ` ZeroCopyByteStream` is somewhat IO as it supports a
`read(size)` interface. However, this interface does not quite match
`typing.BinaryIO`. E.g., `ZeroCopyByteStream` does not support unbounded reads
on purpose (recall [this discussion in the origin
PR](https://github.com/apache/spark/pull/55515#discussion_r3170219262)). In my
view, the purpose of the typing information is to inform the reader/developer
about the API expected for this object. There are some subtle differences
between `ZeroCopyByteStream` and `BinaryIO` that we should make people aware of
here.
I am happy to support `Union[ZeroCopyByteStream, IO]`, but only generalizing
the type to `IO` would not be correct. Therefore, I would argue that the
`ZeroCopyByteStream` typing is OK for now. If we need support for another type
of stream in the future, the typing can easily be extended with a `Union`.
Preemptively adding additional types that are currently not needed is probably
not a good idea.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]