[ https://issues.apache.org/jira/browse/HADOOP-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-18338.
-------------------------------------
    Resolution: Not A Problem

Change the endpoint and s3a doesn't know what region to sign requests with.

See HADOOP-17705 and set fs.s3a.endpoint.region

> Unable to access data from S3 bucket over a vpc endpoint - 400 bad request
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-18338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18338
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common, fs/s3
>            Reporter: Aarti
>            Priority: Major
>         Attachments: spark_s3.txt, spark_s3_vpce_error.txt
>
>
> We are trying to write to an S3 bucket which has a policy with specific IAM
> users, SSE and endpoints. The bucket has 2 endpoints mentioned in its policy:
> a gateway endpoint and an interface endpoint.
>
> When we use the gateway endpoint, which is the general one:
> https://s3.us-east-1.amazonaws.com =>
> the Spark code executes successfully and writes to the S3 bucket.
> But when we use the interface endpoint (which we ideally have to use):
> https://bucket.vpce-<>.s3.us-east-1.vpce.amazonaws.com
> => the Spark code throws an error:
>
> py4j.protocol.Py4JJavaError: An error occurred while calling o91.save.
> : org.apache.hadoop.fs.s3a.AWSBadRequestException: doesBucketExist on
> <BUCKET NAME>: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request
> (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request
> ID: BA67GFNR0Q127VFM; S3 Extended Request ID:
> BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=;
> Proxy: null), S3 Extended Request ID:
> BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=:400
> Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code:
> 400 Bad Request; Request ID: BA67GFNR0Q127VFM; S3 Extended Request ID:
> BopO6Cn1hNzXdWh89hZlnl/QyTJef/1cxmptuP6f4yH7tqfMO36s/7mF+q8v6L5+FmYHXbFdEss=;
> Proxy: null)
>
> Attaching the pyspark code and exception trace
> [^spark_s3.txt]
> [^spark_s3_vpce_error.txt]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
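
For readers applying the resolution above from PySpark, the following is a minimal sketch and not the reporter's attached code. It assumes a Hadoop release that includes HADOOP-17705 and that the region is visible in the endpoint hostname. The helper names `region_from_endpoint` and `s3a_endpoint_conf` and the `vpce-EXAMPLE` hostname are hypothetical placeholders; the `spark.hadoop.` prefix is the usual way Spark forwards a property to the Hadoop configuration.

```python
import re
from typing import Dict, Optional

# Region label in hostnames such as s3.us-east-1.amazonaws.com or
# bucket.vpce-EXAMPLE.s3.us-east-1.vpce.amazonaws.com (hypothetical id).
_REGION_RE = re.compile(r"\.s3[.-]([a-z]{2}(?:-[a-z]+)+-\d+)\.")


def region_from_endpoint(endpoint: str) -> Optional[str]:
    """Return the AWS region embedded in an S3 endpoint, or None if absent."""
    host = re.sub(r"^https?://", "", endpoint).split("/")[0]
    # Prefix a dot so a hostname that starts with "s3." also matches.
    match = _REGION_RE.search("." + host)
    return match.group(1) if match else None


def s3a_endpoint_conf(endpoint: str, region: Optional[str] = None) -> Dict[str, str]:
    """Build Spark settings that pin both the S3A endpoint and signing region."""
    region = region or region_from_endpoint(endpoint)
    if region is None:
        raise ValueError(f"cannot infer a region from {endpoint!r}; pass region=")
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        # Option referenced by HADOOP-17705; tells S3A which region to sign for.
        "spark.hadoop.fs.s3a.endpoint.region": region,
    }


# Usage, assuming pyspark is installed (left as comments so the sketch
# runs without it):
#   builder = SparkSession.builder
#   for key, value in s3a_endpoint_conf(endpoint).items():
#       builder = builder.config(key, value)
```

Passing `region=` explicitly is the safer choice when the endpoint hostname does not carry a region label.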