noslowerdna commented on a change in pull request #606: HADOOP-16190. S3A copyFile operation to include source versionID or etag in the copy request
URL: https://github.com/apache/hadoop/pull/606#discussion_r268802761
##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
##########
@@ -288,6 +288,57 @@ For the default test dataset, hosted in the `landsat-pds` bucket, this is:
 </property>
 ```
+## <a name="versions"></a> Testing against versioned buckets
+
+AWS S3 and some third party stores support versioned buckets.
+
+Hadoop is adding awareness of this, including
+
+* Using version ID to guarantee consistent reads of opened files.
+  [HADOOP-15625](https://issues.apache.org/jira/browse/HADOOP-15625)
+* Using version ID to guarantee consistent multipart copies.
+* Checks to avoid creating needless delete markers.
+
++ maybe more to come.
+
+To test these features, you need to have buckets with object versioning
+enabled.
+
+A full `hadoop-aws` test run implicitly cleans up all files in the bucket
+in `ITestS3AContractRootDir`, so all every test run creates a large set of
+old (deleted) file versions. To avoid large bills, you must
+create a lifecycle rule on the bucket to purge the old versions.

Review comment:
Maybe link to https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html?
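
For illustration only, the lifecycle rule that the patch text asks for might look roughly like the sketch below. It is not part of the patch. The rule ID, the one-day retention period and the helper name `purge_old_versions_rule` are assumptions chosen for the example. The field names follow the S3 lifecycle configuration JSON format as I understand it.

```python
import json


def purge_old_versions_rule(noncurrent_days=1, rule_id="purge-old-versions"):
    """Build a lifecycle configuration that expires noncurrent object versions.

    Both defaults are hypothetical; pick values that suit the test bucket.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                # Permanently delete versions N days after they become noncurrent.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                # Clean up delete markers that no longer have older versions.
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration as JSON so it can be saved to a file.
    print(json.dumps(purge_old_versions_rule(), indent=2))
```

The printed JSON could be saved to a file such as `lifecycle.json` and applied with `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`. Check the output against the AWS documentation linked above before relying on it.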