[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-16085: ------------------------------------ Summary: S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite (was: S3Guard: use object version to protect against inconsistent read after replace/overwrite) > S3Guard: use object version or etags to protect against inconsistent read > after replace/overwrite > ------------------------------------------------------------------------------------------------- > > Key: HADOOP-16085 > URL: https://issues.apache.org/jira/browse/HADOOP-16085 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 > Affects Versions: 3.2.0 > Reporter: Ben Roling > Priority: Major > Attachments: HADOOP-16085_002.patch, HADOOP-16085_3.2.0_001.patch > > > Currently S3Guard doesn't track S3 object versions. If a file is written in > S3A with S3Guard and then subsequently overwritten, there is no protection > against the next reader seeing the old version of the file instead of the new > one. > It seems like the S3Guard metadata could track the S3 object version. When a > file is created or updated, the object version could be written to the > S3Guard metadata. When a file is read, the read out of S3 could be performed > by object version, ensuring the correct version is retrieved. > I don't have a lot of direct experience with this yet, but this is my > impression from looking through the code. My organization is looking to > shift some datasets stored in HDFS over to S3 and is concerned about this > potential issue as there are some cases in our codebase that would do an > overwrite. > I imagine this idea may have been considered before but I couldn't quite > track down any JIRAs discussing it. If there is one, feel free to close this > with a reference to it. > Am I understanding things correctly? Is this idea feasible? Any feedback > that could be provided would be appreciated. We may consider crafting a > patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org