ben-roling commented on a change in pull request #666: HADOOP-16221 add option to fail operation on metadata write failure
URL: https://github.com/apache/hadoop/pull/666#discussion_r271353374
##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
##########
@@ -183,6 +186,40 @@ removed on `S3AFileSystem` level.
 </property>
 ```
+#### Fail on Error
+
+By default, S3AFileSystem write operations will still succeed when updates to
+S3Guard metadata fail. S3AFileSystem first writes the file to S3 and then
+updates the metadata in S3Guard. If the metadata write fails, an error is
+logged, but the overall write operation returns successfully. The file in
+S3 **is not** rolled back.
+
+This is somewhat dangerous as it could result in the type of issue S3Guard is
+designed to avoid. For example, a reader may see an inconsistent listing after
+a recent write since S3Guard may not contain metadata about the recently
+written file due to a metadata write error.
+
+This behavior can be changed by setting the following configuration:
+
+```xml
+<property>
+  <name>fs.s3a.metadatastore.fail.on.write.error</name>
+  <value>true</value>
+</property>
+```
+
+When set to true, a failure to save the metadata will fail the overall write
+operation with `MetadataPersistenceException`. As with the default setting,
+the new/updated file is still in S3 and **is not** rolled back. The S3Guard
+metadata may (is likely to) be out of sync.
+
+The S3Guard metadata for the given file can be corrected with a command like

Review comment:
updated