Fokko commented on code in PR #1568:
URL: https://github.com/apache/iceberg-python/pull/1568#discussion_r1928631882
##########
pyiceberg/io/fsspec.py:
##########
@@ -150,6 +151,9 @@ def _s3(properties: Properties) -> AbstractFileSystem:
if connect_timeout := properties.get(S3_CONNECT_TIMEOUT):
config_kwargs["connect_timeout"] = float(connect_timeout)
+ if request_timeout := properties.get(S3_REQUEST_TIMEOUT):
+ config_kwargs["request_timeout"] = float(request_timeout)
Review Comment:
I think this should be `read_timeout`, looking at:
https://github.com/fsspec/s3fs/blob/51e3c80ef380a82081a171de652e2b699753be2b/s3fs/core.py#L473-L479
##########
pyiceberg/io/pyarrow.py:
##########
@@ -438,6 +442,9 @@ def _initialize_s3_fs(self, netloc: Optional[str]) ->
FileSystem:
if connect_timeout := self.properties.get(S3_CONNECT_TIMEOUT):
client_kwargs["connect_timeout"] = float(connect_timeout)
+ if request_timeout := self.properties.get(S3_REQUEST_TIMEOUT):
+ client_kwargs["request_timeout"] = float(request_timeout)
Review Comment:
Looks good:
https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html

##########
mkdocs/docs/configuration.md:
##########
@@ -116,6 +116,7 @@ For the FileIO there are several configuration options
available:
| s3.region | us-west-2 | Configure the default
region used to initialize an `S3FileSystem`. `PyArrowFileIO` attempts to
automatically resolve the region for each S3 bucket, falling back to this value
if resolution fails.
|
| s3.proxy-uri | <http://my.proxy.com:8080> | Configure the proxy
server to be used by the FileIO.
|
| s3.connect-timeout | 60.0 | Configure socket
connection timeout, in seconds.
|
+| s3.request-timeout | 60.0 | Configure socket read
timeouts on Windows and macOS, in seconds.
|
Review Comment:
I couldn't find a Java equivalent, so I'm fine with introducing this one 👍
##########
pyiceberg/io/pyarrow.py:
##########
@@ -394,6 +395,9 @@ def _initialize_oss_fs(self) -> FileSystem:
if connect_timeout := self.properties.get(S3_CONNECT_TIMEOUT):
client_kwargs["connect_timeout"] = float(connect_timeout)
+ if request_timeout := self.properties.get(S3_REQUEST_TIMEOUT):
+ client_kwargs["request_timeout"] = float(request_timeout)
Review Comment:
Looks good:
https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]