[
https://issues.apache.org/jira/browse/HADOOP-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482540#comment-17482540
]
Bogdan Stolojan commented on HADOOP-18085:
------------------------------------------
Looked at the stack trace; yup, I got this one too. This PR should address it.
> is this actually a change in AP ARNs, rather than just a test regression?
So the way I understand it is:
* S3A inits and reads AP ARN config
* It points S3A to the endpoint specified by the AP ARN (since ARN can contain
regions).
* To know the endpoint it relies on parsing from the SDK
* Since the SDK changed its parsing for AP endpoints (for some reason??), the
tests broke in some confusing ways. It's confusing because AP endpoints are
formatted differently from S3 bucket endpoints, so now even though you're
requesting an AP endpoint you get a bucket endpoint.
* One of the "confusing" ways is highlighted by that stack trace
** When the S3 client builds requests, it looks at the bucket URL and checks
whether the region you're sending the request to is the same as the bucket's
region
** The S3 client is also aware of access point ARNs, so it knows how to parse
the AP ARNs and do this trick for them too
** However, this backfires because you're using an ARN (as you should) but
passing an S3 bucket endpoint (because the endpoint parsing was messed up),
and what you get is this confusing message.
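A rough sketch of the first few steps, for illustration only: it pulls the
region out of an access point ARN and builds the endpoint I'd expect for it.
The class and method names are hypothetical (this is not the S3A or SDK code),
and it assumes the standard `arn:aws:s3:<region>:<account>:accesspoint/<name>`
layout and the `amazonaws.com` domain of the default partition.

```java
// Hypothetical helper, not part of S3A or the AWS SDK.
final class AccessPointArnSketch {
    private AccessPointArnSketch() {}

    /** Returns the region field (4th colon-separated field) of an access point ARN. */
    static String regionOf(String arn) {
        // Limit the split to 6 so the resource part keeps any ':' it contains.
        String[] parts = arn.split(":", 6);
        boolean wellFormed = parts.length == 6
            && "arn".equals(parts[0])
            && "s3".equals(parts[2])
            && !parts[3].isEmpty()
            && parts[5].startsWith("accesspoint");
        if (!wellFormed) {
            throw new IllegalArgumentException("Not an access point ARN: " + arn);
        }
        return parts[3];
    }

    /** Builds the endpoint expected for the ARN (assumes the default AWS partition). */
    static String expectedEndpoint(String arn) {
        return "s3-accesspoint." + regionOf(arn) + ".amazonaws.com";
    }
}
```

With a made-up ARN such as `arn:aws:s3:eu-west-1:123456789012:accesspoint/test`,
`expectedEndpoint` gives `s3-accesspoint.eu-west-1.amazonaws.com`; a plain
bucket ARN has no region field and is rejected.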
Hope this helps.
This is why the fix in the PR makes sure the endpoint starts with
`s3-accesspoint.`, which is arguably a weak check given how much confusion the
bug caused. That makes me think I should add some more checks to it.
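To make the idea concrete, here is a minimal sketch of the prefix check next
to one possible stricter variant that also ties the endpoint to a region. The
names are hypothetical and this is not the code in the PR; only the
`s3-accesspoint.` prefix and the broken `s3.accesspoint-` form come from the
issue itself.

```java
// Hypothetical checks, not the code in the PR.
final class AccessPointEndpointCheck {
    static final String AP_PREFIX = "s3-accesspoint.";

    private AccessPointEndpointCheck() {}

    /** Weak check: the endpoint only has to start with the access point prefix. */
    static boolean hasAccessPointPrefix(String endpoint) {
        return endpoint != null && endpoint.startsWith(AP_PREFIX);
    }

    /** Stricter check: the prefix must be followed directly by the expected region. */
    static boolean matchesRegion(String endpoint, String region) {
        return endpoint != null
            && region != null
            && !region.isEmpty()
            && endpoint.startsWith(AP_PREFIX + region + ".");
    }
}
```

The stricter variant would catch an endpoint that has the right prefix but
points at a different region than the ARN does.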
> S3 SDK Upgrade causes AccessPoint ARN endpoint mistranslation
> -------------------------------------------------------------
>
> Key: HADOOP-18085
> URL: https://issues.apache.org/jira/browse/HADOOP-18085
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3, test
> Affects Versions: 3.3.3
> Reporter: Bogdan Stolojan
> Assignee: Bogdan Stolojan
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Since upgrading the [SDK to
> 1.12.132|https://github.com/apache/hadoop/pull/3864] the access point
> endpoint translation has been broken.
> Correct endpoints should start with "s3-accesspoint."; after the SDK upgrade
> they start with "s3.accesspoint-", which messes up tests + region detection
> by the SDK.