[ https://issues.apache.org/jira/browse/HADOOP-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-17771.
-------------------------------------
    Fix Version/s: 3.3.2
       Resolution: Fixed

S3AFS creation fails "Unable to find a region via the region provider chain."
-----------------------------------------------------------------------------

                Key: HADOOP-17771
                URL: https://issues.apache.org/jira/browse/HADOOP-17771
            Project: Hadoop Common
         Issue Type: Sub-task
         Components: fs/s3
   Affects Versions: 3.3.1
        Environment: * fs.s3a.endpoint is unset
* Host outside EC2
* without the file ~/.aws/config, or without a region set in it
* without the system property aws.region declaring a region
* without the environment variable AWS_REGION declaring a region
           Reporter: Steve Loughran
           Assignee: Steve Loughran
           Priority: Blocker
             Labels: pull-request-available
            Fix For: 3.3.2
         Time Spent: 3h 50m
 Remaining Estimate: 0h

If you don't have {{fs.s3a.endpoint}} set and lack a region set in the environment variable {{AWS_REGION}}, the system property {{aws.region}}, or the file ~/.aws/config, then S3A FS creation fails with the message "Unable to find a region via the region provider chain."

This is caused by the move to the AWS S3 client builder API in HADOOP-13551.

This is pretty dramatic, and no doubt everyone will be asking "why didn't you notice this?". But in fact there are some reasons.
# When running in EC2, all is well. Meaning our big test runs were all happy.
# If a developer has fs.s3a.endpoint set for the test bucket, all is well. Those of us who work with buckets in other regions tend to do this, not least because it can save a HEAD request every time an FS is created.
# If you have a region set in ~/.aws/config, then all is well.

Reason #3 is the real surprise and the one which has really caught us out. Even my tests against buckets in usw-2 through central didn't fail, because of course I, like my colleagues, have the AWS cli client installed locally.
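For reference, each of the region sources in that provider chain can also be set by hand; a minimal sketch of the three options (the region value {{us-west-2}} here is only an example, and passing the system property through {{HADOOP_OPTS}} is one launcher-specific choice among several):

```shell
# Any one of these satisfies the AWS SDK region provider chain.
# "us-west-2" is an example region; use the one your buckets live in.

# Option 1: environment variable
export AWS_REGION=us-west-2

# Option 2: JVM system property (here passed via HADOOP_OPTS; adjust for your launcher)
export HADOOP_OPTS="$HADOOP_OPTS -Daws.region=us-west-2"

# Option 3: a region in ~/.aws/config (the file `aws configure` writes)
mkdir -p ~/.aws
cat > ~/.aws/config <<'EOF'
[default]
region = us-west-2
EOF
```

These only help where the S3A client process actually sees the variable or file; declaring {{fs.s3a.endpoint}} in Hadoop configuration remains the portable fix.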
That locally configured region was sufficient to make the problem go away. It is also why this has been an intermittent problem on test clusters outside AWS infra: it really depended on the VM/docker image whether things worked or not.

h2. Quick Fix: set {{fs.s3a.endpoint}} to {{s3.amazonaws.com}}

If you have found this JIRA because you are encountering this problem, you can fix it by explicitly declaring the endpoint in {{core-site.xml}}:
{code}
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.amazonaws.com</value>
</property>
{code}
For Apache Spark, this can be done in {{spark-defaults.conf}}:
{code}
spark.hadoop.fs.s3a.endpoint s3.amazonaws.com
{code}
If you know the exact AWS region your data lives in, set the endpoint to that region's endpoint, and so save an HTTPS request to s3.amazonaws.com every time an S3A Filesystem instance is created.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org