jiakai-li commented on code in PR #1453:
URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1903135543
##########
pyiceberg/io/pyarrow.py:
##########
@@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]:
def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) ->
FileSystem:
if scheme in {"s3", "s3a", "s3n", "oss"}:
- from pyarrow.fs import S3FileSystem
+ from pyarrow.fs import S3FileSystem, resolve_s3_region
Review Comment:
> I dont think its supported, the underlying call is looking for
`x-amz-bucket-region` which i dont think aliyun will set
Thank you for checking that, I should have looked at it :-)
> since we're using both scheme and bucket to cache FS, this should be fine
right? For the case of `oss://bucket` and `s3://bucket`
Yes, there is no issue after the change now. What I was thinking is for the
`oss://bucket` scenario (ignore the caching behavior). If the bucket used by
`oss` also exists in AWS, then the previous version (before your comment) would
try to resolve the bucket and use it to overwrite the defaul setting. This will
be wrong though, because `oss` bucket region cannot be resolved using pyarrow.
I updated the test case to take this into account and also added an
integration test for multiple filesystem read.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]