**bogthe** commented on a change in pull request #3260:
URL: https://github.com/apache/hadoop/pull/3260#discussion_r682739268
**File path:** `hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md`

````diff
@@ -1576,6 +1576,81 @@
 Why explicitly declare a bucket bound to the central endpoint? It ensures
 that if the default endpoint is changed to a new region, data stored in
 US-east is still reachable.
+## <a name="accesspoints"></a>Configuring S3 AccessPoints usage with S3a
+S3a now supports [S3 Access Point](https://aws.amazon.com/s3/features/access-points/) usage, which
+improves VPC integration with S3 and simplifies your data's permission model, because different
+policies can now be applied at the Access Point level. For more information about why to use them,
+make sure to read the official documentation.
+
+Accessing data through an access point is done by using its ARN, as opposed to just the bucket name.
+You can set the Access Point ARN using the following configuration property:
+```xml
+<property>
+  <name>fs.s3a.accesspoint.arn</name>
+  <value> {ACCESSPOINT_ARN_HERE} </value>
+  <description>Configure S3a traffic to use this AccessPoint</description>
+</property>
+```
+
+Be mindful that this configures **all access** to S3a, and in turn S3, to go through that ARN.
+So for example `s3a://yourbucket/key` will now use your configured ARN when getting data from S3
+instead of your bucket. The flip side to this is that if you're working with multiple buckets
````

**Review comment:** I like it. Will update PR

**File path:** `hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java`

```diff
@@ -2570,6 +2614,11 @@ protected S3ListResult continueListObjects(S3ListRequest request,
         OBJECT_CONTINUE_LIST_REQUEST,
         () -> {
           if (useListV1) {
+            if (accessPoint != null) {
+              // AccessPoints are not compatible with V1List
+              throw new InvalidRequestException("ListV1 is not supported by AccessPoints");
```

**Review comment:** Yep, good idea, upgrading it is!
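The warning above — that a single `fs.s3a.accesspoint.arn` value routes *all* S3A traffic through one ARN — is usually addressed with S3A's per-bucket option pattern (`fs.s3a.bucket.BUCKET.option`), which `propagateBucketOptions` resolves at initialization. The sketch below illustrates that fallback logic with plain JDK types; it is not Hadoop's `Configuration` class, and the class and method names (`AccessPointConfigSketch`, `resolveArn`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (plain JDK, not Hadoop's Configuration): resolve an Access Point
// ARN for a bucket, preferring a per-bucket key over the global fs.s3a.accesspoint.arn,
// in the spirit of S3A's per-bucket option propagation.
public class AccessPointConfigSketch {

    static final String GLOBAL_KEY = "fs.s3a.accesspoint.arn";

    /** Returns the per-bucket ARN if set, else the global ARN, else null. */
    static String resolveArn(Map<String, String> conf, String bucket) {
        String perBucket = conf.get("fs.s3a.bucket." + bucket + ".accesspoint.arn");
        if (perBucket != null && !perBucket.trim().isEmpty()) {
            return perBucket.trim();
        }
        String global = conf.get(GLOBAL_KEY);
        return (global == null || global.trim().isEmpty()) ? null : global.trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(GLOBAL_KEY, "arn:aws:s3:eu-west-1:123456789012:accesspoint/shared-ap");
        conf.put("fs.s3a.bucket.logs.accesspoint.arn",
                 "arn:aws:s3:eu-west-1:123456789012:accesspoint/logs-ap");

        // "logs" resolves to its own Access Point; other buckets fall back to the global ARN.
        System.out.println(resolveArn(conf, "logs"));
        System.out.println(resolveArn(conf, "data"));
    }
}
```

With this scoping, only the buckets you explicitly bind to an Access Point change behaviour, instead of every `s3a://` URI in the job.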
**File path:** `hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java`

```diff
@@ -400,6 +410,14 @@ public void initialize(URI name, Configuration originalConf)
     LOG.debug("Initializing S3AFileSystem for {}", bucket);
     // clone the configuration into one with propagated bucket options
     Configuration conf = propagateBucketOptions(originalConf, bucket);
+
+    String apArn = conf.getTrimmed(ACCESS_POINT_ARN, "");
+    if (!apArn.isEmpty()) {
+      accessPoint = ArnResource.accessPointFromArn(apArn);
+      LOG.info("Using AccessPoint ARN \"{}\" for bucket {}", apArn, bucket);
```

**Review comment:** good point

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
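The `initialize()` change above hands the raw configuration string to `ArnResource.accessPointFromArn`. As a rough illustration of what that parsing involves — not Hadoop's actual implementation — the sketch below splits an Access Point ARN of the standard AWS form `arn:aws:s3:<region>:<account-id>:accesspoint/<name>` into the fields the filesystem needs. The class name `ArnParseSketch` is hypothetical.

```java
// Illustrative only: minimal parsing of an S3 Access Point ARN of the standard form
// arn:aws:s3:<region>:<account-id>:accesspoint/<name>. This is a sketch of the fields
// S3AFileSystem.initialize() needs, not Hadoop's ArnResource class.
public class ArnParseSketch {

    /** Returns {region, accountId, accessPointName}; throws on a malformed ARN. */
    static String[] parseAccessPointArn(String arn) {
        // An ARN has six colon-separated fields; for an access point the
        // last field is "accesspoint/<name>".
        String[] parts = arn.split(":", 6);
        if (parts.length != 6
                || !"arn".equals(parts[0])
                || !"s3".equals(parts[2])
                || !parts[5].startsWith("accesspoint/")) {
            throw new IllegalArgumentException("Not an S3 Access Point ARN: " + arn);
        }
        String name = parts[5].substring("accesspoint/".length());
        return new String[] { parts[3], parts[4], name };
    }

    public static void main(String[] args) {
        String[] fields = parseAccessPointArn(
            "arn:aws:s3:eu-west-1:123456789012:accesspoint/test-ap");
        // region, account id, access point name
        System.out.println(fields[0] + " " + fields[1] + " " + fields[2]);
    }
}
```

Validating the ARN once at initialization, as the patch does, means a bad value fails fast with a clear message instead of surfacing later as an opaque request error.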