[ 
https://issues.apache.org/jira/browse/HADOOP-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774923#comment-17774923
 ] 

ASF GitHub Bot commented on HADOOP-18908:
-----------------------------------------

ahmarsuhail commented on code in PR #6106:
URL: https://github.com/apache/hadoop/pull/6106#discussion_r1358340239


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:
##########
@@ -229,4 +254,49 @@ private static URI getS3Endpoint(String endpoint, final Configuration conf) {
       throw new IllegalArgumentException(e);
     }
   }
+
+  /**
+   * Parses the endpoint to get the region.
+   * If endpoint is the central one, use US_EAST_1.
+   *
+   * @param endpoint the configure endpoint.
+   * @return the S3 region.
+   */
+  private static Region getS3RegionFromEndpoint(String endpoint) {
+
+    if(!endpoint.endsWith(CENTRAL_ENDPOINT)) {
+      LOG.debug("Endpoint {} is not the default; parsing", endpoint);
+      return AwsHostNameUtils.parseSigningRegion(endpoint, S3_SERVICE_NAME).orElse(null);

Review Comment:
   It turns out that `AwsHostNameUtils.parseSigningRegion` is not meant to be
used for non-standard endpoints, which is why a VPCE endpoint gets resolved to
the region "vpce". Can we still merge this PR and create a follow-up PR to
handle non-standard endpoints? I am working with the SDK team to understand how
best to do this: via the SDK or in S3A.
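
For illustration only, here is a minimal sketch of what an "in S3A" option
might look like: a hostname parser that reads the region from the label after
`s3` and tolerates a trailing `.vpce` label. The class `EndpointRegionSketch`,
the method `regionFromEndpoint`, the regular expression and the sample
hostnames are all hypothetical; none of this is code from the PR or an API of
the AWS SDK, and it deliberately ignores dualstack, FIPS and other endpoint
variants.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of S3A or the AWS SDK. */
final class EndpointRegionSketch {

  /** Assumed value of the central endpoint host. */
  private static final String CENTRAL_HOST = "s3.amazonaws.com";

  /**
   * Assumed host shapes: "[prefix.]s3.REGION.amazonaws.com",
   * "[prefix.]s3-REGION.amazonaws.com" and
   * "[prefix.]s3.REGION.vpce.amazonaws.com", optionally with a ".cn" suffix.
   */
  private static final Pattern AWS_S3_HOST = Pattern.compile(
      "^(?:.+\\.)?s3[.-]([a-z0-9-]+)(?:\\.vpce)?\\.amazonaws\\.com(?:\\.cn)?$");

  private EndpointRegionSketch() {
  }

  /**
   * Best-effort region extraction from an endpoint string.
   *
   * @param endpoint endpoint URL or bare hostname.
   * @return the region, or empty if the host does not match the assumed shapes.
   */
  static Optional<String> regionFromEndpoint(String endpoint) {
    if (endpoint == null || endpoint.isEmpty()) {
      return Optional.empty();
    }
    // Add a scheme so that java.net.URI can isolate the host of a bare hostname.
    String withScheme = endpoint.contains("://") ? endpoint : "https://" + endpoint;
    String host;
    try {
      host = URI.create(withScheme).getHost();
    } catch (IllegalArgumentException e) {
      return Optional.empty();
    }
    if (host == null) {
      return Optional.empty();
    }
    host = host.toLowerCase(Locale.ROOT);
    // The central endpoint carries no region label; fall back to us-east-1.
    if (host.equals(CENTRAL_HOST) || host.endsWith("." + CENTRAL_HOST)) {
      return Optional.of("us-east-1");
    }
    Matcher m = AWS_S3_HOST.matcher(host);
    // Third-party stores do not match, so the caller must declare the region.
    return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
  }
}
```

Under these assumptions, an endpoint such as
`https://bucket.vpce-0a1b2c3d-abcd1234.s3.us-west-2.vpce.amazonaws.com` yields
`us-west-2` rather than `vpce`, while a third-party host yields an empty result
so that no region is guessed.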





> Improve s3a region handling, including determining from endpoint
> ----------------------------------------------------------------
>
>                 Key: HADOOP-18908
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18908
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Ahmar Suhail
>            Priority: Major
>              Labels: pull-request-available
>
> s3a now requires the fs.s3a.endpoint.region to be set; and while it can 
> determine it from a network call, this takes time and doesn't work for third 
> party stores.
> proposed
> * reinstate parsing of the fs.s3a.endpoint url to automatically determine 
> region from well known endpoints (and vplink ones)
> * don't try to talk to AWS S3 if endpoint isn't an aws one: for that caller 
> must declare (HADOOP-18673)
> * document this in v2 migration, including stack traces of failures



