emkornfield commented on a change in pull request #4021:
URL: https://github.com/apache/iceberg/pull/4021#discussion_r800126966



##########
File path: python/src/iceberg/io/base.py
##########
@@ -24,7 +24,40 @@
 """
 
 from abc import ABC, abstractmethod
-from typing import Union
+from typing import Protocol, Union, runtime_checkable
+
+
+@runtime_checkable
+class InputStream(Protocol):
+    def read(self, n: int) -> bytes:
+        ...

Review comment:
       So arrow should be [smart 
enough](https://arrow.apache.org/docs/python/parquet.html#reading-and-writing-single-files)
 to distinguish between NativeFile and IOBase and it appears NativeFile is 
protocol compatible with IOBase.  The issues I see with the UnionType are:
   1.  It appears IO is not a protocol, so I think type checks would fail 
unless consumers specifically extended it (I haven't done a lot of development 
with type hinting in python though so I'm not sure).
   2.  It seems like it takes a hard dependency on pyarrow, which in previous 
discussions we wanted to avoid. 
   
   Given this I still think using a protocol that mimics IO[bytes] seems like 
what we would want and for an arrow specific implementation should avoid any 
performance loss.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to