[ https://issues.apache.org/jira/browse/HADOOP-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774923#comment-17774923 ]
ASF GitHub Bot commented on HADOOP-18908:
-----------------------------------------

ahmarsuhail commented on code in PR #6106:
URL: https://github.com/apache/hadoop/pull/6106#discussion_r1358340239


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:
##########
@@ -229,4 +254,49 @@ private static URI getS3Endpoint(String endpoint, final Configuration conf) {
       throw new IllegalArgumentException(e);
     }
   }
+
+  /**
+   * Parses the endpoint to get the region.
+   * If endpoint is the central one, use US_EAST_1.
+   *
+   * @param endpoint the configure endpoint.
+   * @return the S3 region.
+   */
+  private static Region getS3RegionFromEndpoint(String endpoint) {
+
+    if(!endpoint.endsWith(CENTRAL_ENDPOINT)) {
+      LOG.debug("Endpoint {} is not the default; parsing", endpoint);
+      return AwsHostNameUtils.parseSigningRegion(endpoint, S3_SERVICE_NAME).orElse(null);

Review Comment:
   It turns out that `AwsHostNameUtils.parseSigningRegion` is not meant to be used for non-standard endpoints, which is why a vpce endpoint gets resolved to the region "vpce". Can we still merge this PR and create a follow-up PR to handle non-standard endpoints? I am working with the SDK team to understand how best to do this, either via the SDK or in S3A.


> Improve s3a region handling, including determining from endpoint
> ----------------------------------------------------------------
>
>                 Key: HADOOP-18908
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18908
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Ahmar Suhail
>            Priority: Major
>              Labels: pull-request-available
>
> s3a now requires the fs.s3a.endpoint.region to be set; and while it can
> determine it from a network call, this takes time and doesn't work for third
> party stores.
> proposed
> * reinstate parsing of the fs.s3a.endpoint url to automatically determine
> region from well known endpoints (and vplink ones)
> * don't try to talk to AWS S3 if the endpoint isn't an AWS one: for that, the
> caller must declare the region (HADOOP-18673)
> * document this in v2 migration, including stack traces of failures



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
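
For illustration only, the endpoint-to-region parsing discussed in the review comment and in the first proposed bullet could be sketched roughly as below. This is not the code in PR #6106, nor the SDK's `AwsHostNameUtils` logic: the class `EndpointRegionSketch`, the method `parseRegion`, and the regular expression are all hypothetical, and the hostname shapes handled (`s3.<region>.amazonaws.com`, and `...s3.<region>.vpce.amazonaws.com` for VPC endpoints) are assumptions about typical AWS endpoint names. It uses only the JDK, so that a vpce hostname yields its real region rather than the literal "vpce", and a third-party endpoint yields nothing.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; names and pattern are assumptions, not S3A or SDK code. */
class EndpointRegionSketch {

  /** The central endpoint, which is mapped to us-east-1. */
  private static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  /**
   * Assumed hostname shapes: "s3.<region>.amazonaws.com",
   * "s3-<region>.amazonaws.com", "s3.dualstack.<region>.amazonaws.com",
   * and the VPC endpoint form "<...>.s3.<region>.vpce.amazonaws.com".
   * Group 1 captures the region, e.g. "eu-west-1" or "us-gov-west-1".
   */
  private static final Pattern AWS_S3_HOST = Pattern.compile(
      "(?:^|\\.)s3[.-](?:dualstack\\.)?([a-z]{2}(?:-[a-z]+)+-\\d+)"
          + "\\.(?:vpce\\.)?amazonaws\\.com(?:\\.cn)?$");

  /**
   * Try to work out the region from an endpoint string.
   *
   * @param endpoint hostname or URL of the endpoint.
   * @return the region name, or empty if the endpoint is not recognised.
   */
  static Optional<String> parseRegion(String endpoint) {
    // Reduce the endpoint to a bare lower-case hostname:
    // drop any scheme, path and port.
    String host = endpoint.trim().toLowerCase(Locale.ROOT);
    int scheme = host.indexOf("://");
    if (scheme >= 0) {
      host = host.substring(scheme + 3);
    }
    int slash = host.indexOf('/');
    if (slash >= 0) {
      host = host.substring(0, slash);
    }
    int colon = host.indexOf(':');
    if (colon >= 0) {
      host = host.substring(0, colon);
    }

    // The central endpoint has no region in its name.
    if (host.equals(CENTRAL_ENDPOINT) || host.endsWith("." + CENTRAL_ENDPOINT)) {
      return Optional.of("us-east-1");
    }

    // Anything which does not look like an AWS S3 hostname is left
    // unresolved, so the caller has to declare the region explicitly.
    Matcher m = AWS_S3_HOST.matcher(host);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }
}
```

Under these assumptions, `EndpointRegionSketch.parseRegion("https://s3.eu-west-1.amazonaws.com")` returns `Optional.of("eu-west-1")`, a hostname ending in `.s3.us-west-2.vpce.amazonaws.com` returns `Optional.of("us-west-2")`, and a third-party hostname such as `storage.example.com` returns `Optional.empty()`. Whether this belongs in S3A or in the SDK is the open question raised in the review comment.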