[GitHub] [hadoop] bogthe opened a new pull request #3260: HADOOP-17198 Support S3 AccessPoint

GitBox Tue, 03 Aug 2021 10:07:34 -0700


bogthe opened a new pull request #3260:
URL: https://github.com/apache/hadoop/pull/3260



   [HADOOP-17198](https://issues.apache.org/jira/browse/HADOOP-17198)
   
   This change aims to add support for S3 AccessPoints. To use S3 object level
   APIs for an AccessPoint, one has to use the AccessPoint (AP) ARN.
   
   Hence the following have been added:
   - a new property to set the AccessPoint ARN;
   - S3a code that parses and uses the new property, raising appropriate 
exceptions;
   - an initial documentation update for AccessPoints.
   
   What this PR enables:
   - If `apname` is the name of an AccessPoint you have created for a bucket, 
then S3a now allows you to use paths like `s3a://apname/`, provided the new 
`s3a.accesspoint.arn` property is set to the AccessPoint ARN, e.g. 
`arn:aws:s3:eu-west-1:123456789101:accesspoint/apname`.
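
To make the ARN shape above concrete, here is a minimal sketch of how such a value could be split into its parts. The class `AccessPointArnSketch` and its fields are hypothetical names for illustration and are not taken from this PR. It only handles the `accesspoint/<name>` resource form shown in the example ARN.

```java
// Hypothetical helper, not part of the PR: splits an S3 Access Point ARN of the
// form arn:<partition>:s3:<region>:<account-id>:accesspoint/<name>.
final class AccessPointArnSketch {
    final String partition;
    final String region;
    final String accountId;
    final String accessPointName;

    private AccessPointArnSketch(String partition, String region,
                                 String accountId, String accessPointName) {
        this.partition = partition;
        this.region = region;
        this.accountId = accountId;
        this.accessPointName = accessPointName;
    }

    static AccessPointArnSketch parse(String arn) {
        if (arn == null) {
            throw new IllegalArgumentException("Access Point ARN must not be null");
        }
        // A limit of 6 keeps the resource part ("accesspoint/<name>") in one piece.
        String[] parts = arn.split(":", 6);
        if (parts.length != 6 || !"arn".equals(parts[0]) || !"s3".equals(parts[2])) {
            throw new IllegalArgumentException("Not an S3 ARN: " + arn);
        }
        String prefix = "accesspoint/";
        String resource = parts[5];
        if (!resource.startsWith(prefix) || resource.length() == prefix.length()) {
            throw new IllegalArgumentException("Not an Access Point ARN: " + arn);
        }
        return new AccessPointArnSketch(parts[1], parts[3], parts[4],
                resource.substring(prefix.length()));
    }
}
```

With the example value, `parse` would yield the region `eu-west-1`, the account `123456789101` and the name `apname`; a plain bucket name is rejected with an `IllegalArgumentException`.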
   
   There's one thing I'm not sure about with this initial implementation, so I 
am looking for feedback on whether and how I should tackle it:
   
   The `S3a` bucket now has 2 "meanings": it can be a bucket name or an Access 
Point ARN. From the point of view of interacting with the SDK, they are 
interchangeable, and internal string parsing logic is used to create the request 
for the right endpoint. However, I think it would be nicer to have a clearer 
abstraction for bucket names or access point ARNs that S3a operations can work 
with. This abstraction comes at the cost of a refactor, which I'm not sure is 
worth it right now. Even a quick search for `.getHost()` shows quite a few 
places where the bucket name is deduced from the `host`.
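
As a starting point for discussion, one possible shape for such an abstraction is sketched below. Everything here is an assumption: `BucketRefSketch`, `of`, `isAccessPoint` and `sdkBucketArgument` are hypothetical names that do not exist in this PR, and the check that the ARN ends in `:accesspoint/<host>` is an illustrative guess at a sensible validation rather than the implemented behaviour.

```java
// Hypothetical value type, not taken from the PR: wraps the host part of an
// s3a:// URI together with an optional Access Point ARN.
final class BucketRefSketch {
    private final String name;            // host part of the s3a:// URI
    private final String accessPointArn;  // null for a plain bucket

    private BucketRefSketch(String name, String accessPointArn) {
        this.name = name;
        this.accessPointArn = accessPointArn;
    }

    static BucketRefSketch of(String host, String accessPointArn) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("Bucket host must not be empty");
        }
        // Assumption: the configured ARN must name the same access point as the URI host.
        if (accessPointArn != null && !accessPointArn.endsWith(":accesspoint/" + host)) {
            throw new IllegalArgumentException(
                    "ARN " + accessPointArn + " does not name access point " + host);
        }
        return new BucketRefSketch(host, accessPointArn);
    }

    boolean isAccessPoint() {
        return accessPointArn != null;
    }

    // Value to pass wherever the SDK expects a bucket: the ARN if set, else the name.
    String sdkBucketArgument() {
        return isAccessPoint() ? accessPointArn : name;
    }

    String name() {
        return name;
    }
}
```

Call sites that currently derive the bucket from `.getHost()` could then ask one object for the value to hand to the SDK, at the price of the refactor mentioned above.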
   
   




