[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765489#comment-16765489 ]
Ben Roling commented on HADOOP-16085:
-------------------------------------

I commented on HADOOP-15625:
https://issues.apache.org/jira/browse/HADOOP-15625?focusedCommentId=16765486&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16765486

As mentioned there, I have a patch for that issue. I'm having trouble uploading it for some reason though. It is as though I don't have permission. The attachment area of the Jira doesn't look like it does on this issue where I AM allowed to upload.

In that patch I elected to just use a vanilla IOException for the exception type. Alternative suggestions are welcome.

> S3Guard: use object version to protect against inconsistent read after replace/overwrite
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16085
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Ben Roling
>            Priority: Major
>         Attachments: HADOOP-16085_002.patch, HADOOP-16085_3.2.0_001.patch
>
>
> Currently S3Guard doesn't track S3 object versions. If a file is written in
> S3A with S3Guard and then subsequently overwritten, there is no protection
> against the next reader seeing the old version of the file instead of the new
> one.
>
> It seems like the S3Guard metadata could track the S3 object version. When a
> file is created or updated, the object version could be written to the
> S3Guard metadata. When a file is read, the read out of S3 could be performed
> by object version, ensuring the correct version is retrieved.
>
> I don't have a lot of direct experience with this yet, but this is my
> impression from looking through the code. My organization is looking to
> shift some datasets stored in HDFS over to S3 and is concerned about this
> potential issue as there are some cases in our codebase that would do an
> overwrite.
>
> I imagine this idea may have been considered before but I couldn't quite
> track down any JIRAs discussing it. If there is one, feel free to close this
> with a reference to it.
>
> Am I understanding things correctly? Is this idea feasible? Any feedback
> that could be provided would be appreciated. We may consider crafting a
> patch.
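
The idea in the quoted description (record the object version on write, then read by that version) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `VersionedStore` and `GuardedClient` are hypothetical names, not S3A or S3Guard classes, the "v1", "v2" version IDs are made up, and the stale read is simulated by always returning the oldest copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for a versioned object store.
class VersionedStore {
    private final Map<String, List<String>> versions = new HashMap<>();

    // Stores a new version of the object and returns its made-up version ID.
    String put(String key, String data) {
        List<String> history = versions.computeIfAbsent(key, k -> new ArrayList<>());
        history.add(data);
        return "v" + history.size();
    }

    // Models an inconsistent read after overwrite by returning the oldest copy.
    String getPossiblyStale(String key) {
        List<String> history = versions.get(key);
        if (history == null) {
            throw new NoSuchElementException("no such key: " + key);
        }
        return history.get(0);
    }

    // Reads one specific version of the object.
    String getByVersion(String key, String versionId) {
        List<String> history = versions.get(key);
        if (history == null) {
            throw new NoSuchElementException("no such key: " + key);
        }
        int index = Integer.parseInt(versionId.substring(1)) - 1;
        if (index < 0 || index >= history.size()) {
            throw new NoSuchElementException("no such version: " + versionId);
        }
        return history.get(index);
    }
}

// Hypothetical client that keeps a key -> version ID map, standing in for
// the metadata store described in the issue.
class GuardedClient {
    private final VersionedStore store;
    private final Map<String, String> trackedVersions = new HashMap<>();

    GuardedClient(VersionedStore store) {
        this.store = store;
    }

    // On create or overwrite, record the version ID returned by the store.
    void write(String key, String data) {
        trackedVersions.put(key, store.put(key, data));
    }

    // On read, ask for the recorded version; fall back to a plain read
    // only when no version was ever recorded for the key.
    String read(String key) {
        String versionId = trackedVersions.get(key);
        if (versionId == null) {
            return store.getPossiblyStale(key);
        }
        return store.getByVersion(key, versionId);
    }
}
```

After writing "old" and then "new" to the same key through `GuardedClient`, a plain `getPossiblyStale` call on the store still returns "old" in this model, while `GuardedClient.read` returns "new" because it requests the recorded version. What a real implementation should do when the recorded version is not yet visible (retry, fail, or fall back) is left open here.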