[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-11 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765489#comment-16765489
 ] 

Ben Roling commented on HADOOP-16085:
-

I commented on HADOOP-15625:

https://issues.apache.org/jira/browse/HADOOP-15625?focusedCommentId=16765486&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16765486

 

As mentioned there, I have a patch for that issue.  I'm having trouble 
uploading it for some reason though.  It is as though I don't have permission.  
The attachment area of the Jira doesn't look like it does on this issue where I 
AM allowed to upload.

In that patch I elected to just use a vanilla IOException for the exception 
type.  Alternative suggestions are welcome.

> S3Guard: use object version to protect against inconsistent read after 
> replace/overwrite
> 
>
> Key: HADOOP-16085
> URL: https://issues.apache.org/jira/browse/HADOOP-16085
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Ben Roling
>Priority: Major
> Attachments: HADOOP-16085_002.patch, HADOOP-16085_3.2.0_001.patch
>
>
> Currently S3Guard doesn't track S3 object versions.  If a file is written in 
> S3A with S3Guard and then subsequently overwritten, there is no protection 
> against the next reader seeing the old version of the file instead of the new 
> one.
> It seems like the S3Guard metadata could track the S3 object version.  When a 
> file is created or updated, the object version could be written to the 
> S3Guard metadata.  When a file is read, the read out of S3 could be performed 
> by object version, ensuring the correct version is retrieved.
> I don't have a lot of direct experience with this yet, but this is my 
> impression from looking through the code.  My organization is looking to 
> shift some datasets stored in HDFS over to S3 and is concerned about this 
> potential issue as there are some cases in our codebase that would do an 
> overwrite.
> I imagine this idea may have been considered before but I couldn't quite 
> track down any JIRAs discussing it.  If there is one, feel free to close this 
> with a reference to it.
> Am I understanding things correctly?  Is this idea feasible?  Any feedback 
> that could be provided would be appreciated.  We may consider crafting a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-08 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763676#comment-16763676
 ] 

Ben Roling commented on HADOOP-16085:
-

Thanks for the feedback [~ste...@apache.org].

With respect to the S3AFileSystem.getFileStatus() change I should have been a 
bit clearer.  I changed only the method signature, not the real type being 
returned.  S3AFileSystem.getFileStatus() is just a wrapper over 
innerGetFileStatus() which was already returning S3AFileStatus.  As such, it 
doesn't seem to me that it should have introduced any new serialization 
concerns, right?  I'll avoid the method signature change though and use casts 
where necessary instead.
{quote}IMO, failing because a file has been overwritten is fine, but ideally it 
should fail with a meaningful error, not EOF
{quote}
Fair point.  I was planning to ask for feedback on the exception type in this 
scenario anyway, but failed to do so in my last comment.  I chose EOFException 
to match the current behavior in the seek() after overwrite scenario and 
because I was having trouble choosing a better exception type.  I thought about 
possibly FileNotFoundException, but that didn't really feel right as the file 
does still exist.  I was thinking something more like 
ConcurrentModificationException, but that's more Java Collections oriented and 
not an IOException.  I wondered if there was an IOException similar to that 
defined somewhere but couldn't find one.  Another option I considered was 
creating a new IOException type within the S3A package.  I browsed other 
available IOException types and didn't see a good fit.  Did you have any 
specific suggestions?
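As a rough illustration of the "new IOException type within the S3A package" option discussed above: something like the following (class name, placement, and message wording are all invented for illustration, not taken from any patch) would let callers distinguish "object changed underneath me" from a genuine end-of-file.

```java
import java.io.IOException;

// Hypothetical sketch: a dedicated IOException subtype for the case where
// the S3 object was overwritten or deleted underneath an open stream,
// instead of reusing EOFException.
class RemoteFileChangedException extends IOException {
    private static final long serialVersionUID = 1L;

    private final String path;

    RemoteFileChangedException(String path, String operation) {
        super("Operation " + operation + " on " + path
            + " failed: the object was overwritten or deleted mid-read");
        this.path = path;
    }

    String getPath() {
        return path;
    }
}
```

Because it extends IOException, existing catch blocks keep working, while callers that care can catch the specific type.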
{quote}One of the committer tests is going to have to be extended for this
{quote}
Ah, yeah, I'll need to dig deeper on that subject to better understand how 
those work and the updates that would be needed.
{quote}how about we start with HADOOP-15625 to make the input stream use etag 
to detect failures in a file, with tests to create those conditions{quote}

Sure, that sounds reasonable.  I'll create a patch for that.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-08 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763516#comment-16763516
 ] 

Steve Loughran commented on HADOOP-16085:
-

bq.   I changed S3AFileSystem.getFileStatus() to return S3AFileStatus instead 
of vanilla FileStatus.  I'm honestly not 100% sure if that creates any sort of 
compatibility problem or is in any other way objectionable.  If so, I could 
cast the status where necessary instead.

Probably better to cast. The mocking used in some of the tests contains 
assumptions. A more subtle issue is that during benchmarking and similar 
exercises, people add their own wrapper around a store (e.g. set fs.s3a.impl = 
wrapperclass). If any code ever assumes that the output of a filesystem created 
off an s3a URL is always S3AFileStatus, one day they'll be disappointed.

For code in the org.apache.hadoop.fs.s3a module itself, it's a different 
story: tighter coupling is allowed.
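To illustrate the caller-side cast being suggested, here is a minimal sketch using plain-Java stand-ins for Hadoop's FileStatus/S3AFileStatus (not the real classes): the consumer checks the runtime type rather than relying on a narrowed signature, so a wrapper filesystem that hands back the base type doesn't break anything.

```java
// Stand-in for Hadoop's FileStatus (simplified to one field).
class FileStatus {
    final String path;
    FileStatus(String path) { this.path = path; }
}

// Stand-in for S3AFileStatus, carrying the extra etag field.
class S3AFileStatus extends FileStatus {
    final String eTag;
    S3AFileStatus(String path, String eTag) { super(path); this.eTag = eTag; }
}

class CastDemo {
    // Defensive cast at the call site: only use the etag when the status
    // really is an S3AFileStatus; a wrapped/mocked FS may return the base type.
    static String eTagOf(FileStatus status) {
        if (status instanceof S3AFileStatus) {
            return ((S3AFileStatus) status).eTag;
        }
        return null; // no etag available from a non-S3A status
    }
}
```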


h3. *serialization*. 

S3AFileStatus needs to be serializable; {{S3ATestUtils.roundTrip}} should 
verify that things round trip. As well as adding tests to verify etags come 
through that way, the deserialization probably needs to handle the case that 
the data came from an older version (or we set a serialVersionUID, or subclass 
Writable to check as part of the deserialization).

Real {{FileStatus}} is marshalled between Namenode and clients a lot, hence 
its protobuf representation. S3AFileStatus doesn't get marshalled that way, but 
I know that I pass them around in Spark jobs, and so others may too. In that 
kind of use, we won't worry about handling cross-version wire formats, but at 
least recognising the problem would be good (i.e. for Java serialization, 
change the serialization ID; for Writable marshalling, well, who knows? Either 
leave the field out, or add it and fail cleanly if it's not there).
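A minimal sketch of the Java-serialization half of this concern (class and field names invented for illustration): pin serialVersionUID when adding the etag field, so copies serialized by an older writer still deserialize (Java fills the missing field with null), and make readers tolerate that null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative status class: keeping serialVersionUID stable means adding
// the eTag field stays deserialization-compatible with older writers.
class StatusWithEtag implements Serializable {
    private static final long serialVersionUID = 1L;

    final String path;
    final String eTag; // may be null when the data came from an older version

    StatusWithEtag(String path, String eTag) {
        this.path = path;
        this.eTag = eTag;
    }

    boolean hasEtag() { return eTag != null; }
}

class RoundTrip {
    // Serialize then deserialize, the shape of check a roundTrip test helper
    // would perform.
    static StatusWithEtag roundTrip(StatusWithEtag in) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream is = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (StatusWithEtag) is.readObject();
        }
    }
}
```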


bq. The next thing is that there is a slight behavior change with seek() if 
there are concurrent readers and writers (a problematic thing anyway).  With my 
changes, a seek() backwards will always result in EOFException on the next read 
within the S3AInputStream.  This happens because my changes pin the 
S3AInputStream to an eTag.  A seek() backwards causes a re-open and since the 
eTag on S3 will have changed with a new write, a read with the old eTag will 
fail.  I think this is actually desirable, but still worthy of mention.  The 
prior code would silently switch over to reading the new version of the file 
within the context of the same S3AInputStream.  Only if the new version of the 
file is shorter would an EOFException potentially happen when it seeks past the 
length of the new version of the file.

This is the same change as with HADOOP-15625, which I think I'm going to 
complete/merge in first, as its the simplest first step: detect and react to a 
change in versions during the duration of a read. 

All the hadoop fs specs are very ambiguous about what happens to a source 
file which is changed while it is open, because every FS behaves differently. 
Changes may be observable to clients with an open handle, or they may not. And 
if it is observable, then exactly when the changes become visible is undefined. 
IMO, failing because a file has been overwritten is fine, but ideally it 
should fail with a meaningful error, not EOF, as EOF can be mapped to a -1 on 
a read and so get treated specially (remember, a read() can trigger a new GET 
in random IO mode, so it may just be a read() call that surfaces this). What's 
interesting about S3 and Swift is that as you seek around, *older versions of 
the data may become visible*. I've actually seen that on Swift, never on S3, 
though their CRUD consistency model allows for it.

Actually, this has got me thinking a lot more about some of the nuances here, 
and what that means for testing. For this & HADOOP-15625, some tests will be 
needed for random IO, lazy seek, overwrite the data with something shorter, 
brief wait for stability and then call read(), expecting a more detailed error 
than just -1, which is what you'd see today.

Other failure cases to create in tests, doing these before any changes just to 
see what happens today

* file deleted after input stream opened but before first read; then between 
positioned reads
* seek to EOF-1, overwrite file with longer, do 1+ read() to see what happens
* file overwritten with 0 byte file after open & before first read. That's 
always one of the corner cases which breaks things.


w.r.t finished write, yes, sounds like a failure is the right approach. We may 
want to think more about how to do some fault injection with S3Guard IO; the 
LocalMetadataStore would be the safest place to add this.

One of the committer tests is going to have to be extended for this, and/or the 
multipart upload contract tests, to verify that updates to a file uploaded that 
way have S3Guard's etags updated. It should be a matter of getting the etag 
from that completed write.

[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-07 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763130#comment-16763130
 ] 

Ben Roling commented on HADOOP-16085:
-

I've uploaded a new patch based on trunk and storing eTag in S3Guard instead of 
versionId.  The existing and my new tests pass, but there are a few things 
worth mentioning.

I'll start with the simplest one.  I changed S3AFileSystem.getFileStatus() to 
return S3AFileStatus instead of vanilla FileStatus.  I'm honestly not 100% sure 
if that creates any sort of compatibility problem or is in any other way 
objectionable.  If so, I could cast the status where necessary instead.

The next thing is that there is a slight behavior change with seek() if there 
are concurrent readers and writers (a problematic thing anyway).  With my 
changes, a seek() backwards will always result in EOFException on the next read 
within the S3AInputStream.  This happens because my changes pin the 
S3AInputStream to an eTag.  A seek() backwards causes a re-open and since the 
eTag on S3 will have changed with a new write, a read with the old eTag will 
fail.  I think this is actually desirable, but still worthy of mention.  The 
prior code would silently switch over to reading the new version of the file 
within the context of the same S3AInputStream.  Only if the new version of the 
file is shorter would an EOFException potentially happen when it seeks past the 
length of the new version of the file.
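The pinned-etag behaviour described above can be modelled with a toy sketch (all names invented; the real code would issue a conditional GET against S3 rather than consult a map): the stream captures the etag at open time, and a reopen fails once the stored etag differs.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model of an input stream pinned to the etag it opened with.
class PinnedStream {
    // Stand-in for S3: key -> current etag of the object.
    static final Map<String, String> STORE = new HashMap<>();

    final String key;
    final String pinnedEtag; // captured when the stream is opened

    PinnedStream(String key) {
        this.key = key;
        this.pinnedEtag = STORE.get(key);
    }

    // A backwards seek() forces a reopen; in this model that is a
    // conditional read against the pinned etag, which fails if the
    // object was overwritten in the meantime.
    void reopen() throws IOException {
        if (!pinnedEtag.equals(STORE.get(key))) {
            throw new EOFException(
                "etag changed for " + key + "; object was overwritten");
        }
    }
}
```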

Finally is the worst of the issues.  I realized that if an overwrite of a file 
succeeds on S3 but fails during the S3Guard update (e.g. exception 
communicating with Dynamo), from the client's perspective the update was 
successful.  S3AFileSystem.finishedWrite() simply logs an error for the S3Guard 
issue and moves on.  However, any subsequent read of the file will fail.  The 
read will fail because S3Guard still has the old eTag and any read is going to 
use the S3Guard eTag when calling through to GetObject on S3.  This will not 
return anything as the eTag doesn't match.

This led me to thinking I should update the exception handling in 
S3AFileSystem.finishedWrite() to allow the IOException on S3Guard update to 
propagate rather than be caught and logged.  This should at least trigger the 
writer to realize something went wrong and take some action.  Really all it 
seems the writer can do to resolve the situation is write the file again.  
Assuming the new write goes through, S3Guard will get the correct new eTag and 
all will be well again.  I have not made this update yet though.  Thoughts on 
that?
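The proposed exception-handling change can be sketched as follows (MetadataStore here is a stand-in interface, not the real S3Guard API): finishedWrite() stops swallowing the metadata-store failure, so the writer sees the IOException and can retry the write.

```java
import java.io.IOException;

class FinishedWriteDemo {
    // Stand-in for the S3Guard metadata store update.
    interface MetadataStore {
        void put(String key, String eTag) throws IOException;
    }

    // Proposed behaviour: no try/catch-and-log around the metadata update.
    // A failure updating S3Guard now propagates to the writer, which can
    // rewrite the file to bring the stored etag back in sync.
    static void finishedWrite(MetadataStore store, String key, String eTag)
            throws IOException {
        store.put(key, eTag);
    }
}
```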




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-07 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763078#comment-16763078
 ] 

Aaron Fabbri commented on HADOOP-16085:
---

{quote}My other concern is that this requires enabling object versioning. I 
know Aaron Fabbri has done some testing with that and I think eventually hit 
issues. Was it just a matter of the space all the versions were taking up, or 
was it actually a performance problem once there was enough overhead?{quote}

Yeah, I had "broken" certain paths (keys) on an S3 bucket by leaving versioning 
enabled on a dev bucket where I'd frequently delete and recreate the same keys. 
There appeared to be some scalability limit on the number of versions a 
particular key can have, so a lifecycle policy to purge old versions would be 
important, I think.

I share [~ste...@apache.org]'s hesitation on doing this all in S3Guard, just 
from experience with all the corner cases and S3 flakiness. I'm glad you are 
looking into it and prototyping, though; we want more people to learn this 
codebase.





[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758741#comment-16758741
 ] 

Ben Roling commented on HADOOP-16085:
-

Oh, I see.  For us overwrites would be rare but we would want protection 
against an inconsistent read afterwards nonetheless.  We would implement 
lifecycle policy to get rid of old versions.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Andrew Olson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758736#comment-16758736
 ] 

Andrew Olson commented on HADOOP-16085:
---

[~mackrorysd] Looks like it's possible to create a [bucket lifecycle 
rule|https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html]
 to automatically purge out old object versions after some specified expiration 
time, which would be helpful to keep the S3 space from indefinitely growing if 
the same file paths are being continually overwritten.
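For reference, a lifecycle configuration along these lines would do it (this is a hedged example: the JSON shape follows the AWS `put-bucket-lifecycle-configuration` input format, and the 30-day window and rule ID are arbitrary choices, not recommendations from this thread):

```json
{
  "Rules": [
    {
      "ID": "expire-old-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
```

Applied with `aws s3api put-bucket-lifecycle-configuration`, this deletes noncurrent object versions 30 days after they are superseded, keeping the bucket from growing without bound under repeated overwrites.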




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758726#comment-16758726
 ] 

Sean Mackrory commented on HADOOP-16085:


{quote}object version can be up to 1024 characters{quote}
I'm less concerned about the space taken up in the metadata store - the 
problems I'm trying to remember were due to the amount of space in the S3 
bucket itself. It was with repeated test runs, so similar filenames were used 
many, many times (which is not that realistic, but the more you care about 
read-after-update consistency, the more this would impact you), and so many 
versions had been kept.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758719#comment-16758719
 ] 

Ben Roling commented on HADOOP-16085:
-

Thanks for keeping the feedback coming!

 

[~ste...@apache.org]
{quote}if you use the S3A committers for your work, and the default mode 
-insert a guid into the filename- then filenames are always created unique.  It 
becomes impossible to get a RAW inconsistency. This is essentially where we are 
going, along with Apache Iceberg (incubating).
{quote}
Most of our processing is currently in Apache Crunch, for which the S3A 
committers don't really seem to apply at the moment.

I've seen the Apache Iceberg project and it does look quite interesting.  It's 
not practical for us to get everything to Iceberg before moving things to S3 
though.  We'll probably look at it closer in the future.
{quote}I like etags because they are exposed in getFileChecksum(); their flaw 
is that they can be very large on massive MPUs (32bytes/block uploaded).
{quote}
I'm not sure what you mean about getFileChecksum()?  I would expect to pull the 
etags from PutObjectResult.getETag() and 
CompleteMultipartUploadResult.getETag().  It doesn't seem necessary to me to 
track etag per block uploaded.  Is there something I am missing?
{quote}BTW, if you are worried about how observable is eventual consistency, 
generally its delayed listings over actual content. There's a really good paper 
with experimental data which does measure how often you can observe RAW 
inconsistencies [http://www.aifb.kit.edu/images/8/8d/Ic2e2014.pdf]
{quote}
Thanks for the reference.  I happened upon a link to that from 
[are-we-consistent-yet|https://github.com/gaul/are-we-consistent-yet] as well.  
I need to have a full read through it.

 

[~mackrorysd]
{quote}we need to gracefully deal with any row missing an object version. The 
other direction is easy - if this simply adds a new field, old code will ignore 
it and we'll continue to get the current behavior.
{quote}
I don't think this is too much of a problem.  I believe the code in my patch 
already handles it.

 
{quote}My other concern is that this requires enabling object versioning. I 
know [~fabbri] has done some testing with that and I think eventually hit 
issues. Was it just a matter of the space all the versions were taking up, or 
was it actually a performance problem once there was enough overhead?
{quote}
I'd like to hear more about this.  From a space perspective, the [S3 
documentation|https://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html]
 says an object version ID can be up to 1024 characters, but in my experience 
they look like they are 32 (the same length as an etag).  As I mentioned 
before, I'm looking at switching the patch over to use etag instead of object 
version anyway.  I haven't gotten around to the code changes for it yet, but it 
doesn't seem like it would be much work.  It's just a different field on 
PutObjectResult, CompleteMultipartUploadResult, and GetObjectRequest.

 

 


[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758672#comment-16758672
 ] 

Sean Mackrory commented on HADOOP-16085:


Thanks for submitting a patch [~ben.roling]. Haven't had a chance to do a full 
review yet, but one of [~fabbri]'s comments was also high on my list of things 
to watch out for:
{quote}Backward / forward compatible with existing S3Guarded buckets and Dynamo 
tables.{quote}
Specifically, we need to gracefully deal with any row missing an object 
version. The other direction is easy - if this simply adds a new field, old 
code will ignore it and we'll continue to get the current behavior.
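The graceful-degradation rule for rows missing an object version could look something like this sketch (read_request_params is a hypothetical helper, not actual S3A code):

```python
def read_request_params(metadata_item):
    """Build GET parameters from a metadata-store row (hypothetical helper).
    Rows written by older code have no 'versionId', so fall back to an
    unconstrained read (the current behavior) rather than failing."""
    params = {"key": metadata_item["key"]}
    version = metadata_item.get("versionId")  # absent on rows from old code
    if version is not None:
        params["versionId"] = version
    return params
```

Old rows simply yield today's unconstrained GET, so the upgrade path costs nothing.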

My other concern is that this requires enabling object versioning. I know 
[~fabbri] has done some testing with that and I think he eventually hit issues. 
Was it just a matter of the space all the versions were taking up, or was it 
actually a performance problem once there was enough version overhead?


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758577#comment-16758577
 ] 

Steve Loughran commented on HADOOP-16085:
-

The enemy here is eventual consistency, which is of course the whole reason 
S3Guard was needed.

What issues are we worrying about?

# mixed writers: some going through S3Guard, some not. Even in non-auth mode, 
I worry about delete tombstones.
# failure during large operations, leaving S3 out of sync with the store.
# failure during a workflow, with one or more GET calls on the second attempt 
picking up the old version.

HADOOP-15625 is going to address the changes within an open file through etag 
comparison, but without the etag being cached in the S3Guard repo, it's not 
going to detect inconsistencies between the version expected and the version 
read.

Personally, I'm kind of reluctant to rely on S3Guard as the sole defence 
against this problem.

bq.  a re-run of a pipeline stage should always use a new output directory,

If you use the S3A committers for your work with the default mode (insert a 
GUID into the filename), then filenames are always created unique, and it 
becomes impossible to get a raw-S3 inconsistency. This is essentially where we 
are going, along with Apache Iceberg (incubating): rather than jump through 
hoop after hoop of workarounds for S3's apparent decision to never deliver 
consistent views, come up with data structures which only need one point of 
consistency (you need to know the unique filename of the latest Iceberg file).

Putting that aside, yes, keeping version markers would be good. I like etags 
because they are exposed in getFileChecksum(); their flaw is that they can be 
very large on massive MPUs (32 bytes per block uploaded).
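For context on that per-block overhead: a multipart ETag is commonly observed (though not contractually documented by S3) to be the MD5 of the concatenated 16-byte part MD5s, suffixed with the part count, so tracking it exactly means carrying one 16-byte (32 hex character) digest per uploaded block. A runnable sketch of that observed derivation:

```python
import hashlib

def multipart_etag(part_bodies):
    """Composite ETag as S3 is commonly observed to compute it for multipart
    uploads: MD5 of the concatenated binary MD5s of each part, hex-encoded,
    plus '-<part count>'. A sketch of observed behavior, not a guarantee."""
    digests = b"".join(hashlib.md5(p).digest() for p in part_bodies)
    return hashlib.md5(digests).hexdigest() + "-" + str(len(part_bodies))

# Two parts -> a 32-hex-char digest with a '-2' suffix:
example = multipart_etag([b"a" * 5, b"b" * 5])
```

Note the suffix means a multipart ETag never equals the plain MD5 ETag of the same bytes uploaded in one PUT.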


BTW, if you are worried about how observable eventual consistency is: 
generally it shows up as delayed listings rather than stale content. There's a 
really good paper with experimental data which measures how often you can 
observe raw-S3 inconsistencies: http://www.aifb.kit.edu/images/8/8d/Ic2e2014.pdf





[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758479#comment-16758479
 ] 

Ben Roling commented on HADOOP-16085:
-

I've attached my first version of the patch.  It's based on 3.2.0 rather than 
trunk.  I was having a bit of trouble with the build and the tests when 
working off master, so I switched over to 3.2.0.  I need to port it back to 
trunk, but there are some conflicts to resolve in doing so.  Anyway, if anyone 
wants a look, the initial patch should give you a general idea of the changes.  
It's not terribly complicated.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-02-01 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758432#comment-16758432
 ] 

Ben Roling commented on HADOOP-16085:
-

Thanks for the thoughts [~fabbri].  I have an initial patch that I will upload 
soon.  The patch stores object versionId and as such only provides 
read-after-overwrite protection if object versioning is enabled.  If object 
versioning is not enabled on the bucket, things would function the same as 
before.

I hadn't really considered storing eTags instead.  I'll look at the feasibility 
of doing that as it could remove the dependency on enabling object versioning 
to make the feature more broadly applicable.  I think my organization is likely 
to enable object versioning anyway, but if S3Guard doesn't depend on it then 
more folks may benefit.

Thanks for your list of considerations.  Here are some responses:
 * The feature is always enabled and adds zero round trips.  versionId was 
already available on the PutObject response, so I'm just capturing and storing 
it.  That's how my patch works, anyway.  I'm curious for feedback: do you 
believe there should be a capability to toggle the feature on or off?
 * There isn't really a conflict resolution policy.  If we have versionId in 
the metadata, we provide it on the GetObject request.  We either get back what 
we are looking for or a 404.  I'm guessing 404s are not going to happen (except 
if the object is deleted outside the context of S3Guard, but that's outside the 
scope of this).  I assume read-after-overwrite inconsistencies in general with 
S3 happen due to cache hits on the old version, but when the version (eTag or 
versionId) is explicitly specified there should be no cache hit and we would 
get the same read-after-write consistency as you get on an initial PUT (no 
overwrite).  Even if I am wrong, the worst case is you get a 
FileNotFoundException, which is much better than an inconsistent read and no 
error.  Retries could be added on 404, but maybe wait until it is proven they 
are necessary.
 * I'm not trying to protect against a racing writer issue.  I can add 
something to the documentation about it.
 * The changes are backward and forward compatible with existing buckets and 
tables.  The new versionId attribute is optional.
 * MetadataStore expiry should be fine.  The versionId is optional.  If it 
isn't there, no problem.  The only risk of inconsistent read-after-overwrite is 
if the metadata is purged more quickly than S3 itself becomes 
read-after-overwrite consistent for the object being read.  I can update 
documentation to mention this.
 * I guess with regard to HADOOP-15779, there could be a new type of S3Guard 
metadata inconsistency.  If an object is overwritten outside of S3Guard, 
S3Guard will not have the correct eTag or versionId and the reader may end up 
seeing either a 404 or the old content.  Prior to this, the reader would see 
whatever content S3 returns on a GET that is not qualified by eTag or versionId.
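The versionId semantics in the points above can be simulated with a tiny in-memory model of a version-enabled bucket (VersionedMockS3 and the meta dict are illustrative stand-ins, not S3A or DynamoDB code):

```python
import itertools

class VersionedMockS3:
    """In-memory model of a version-enabled bucket (illustrative only)."""

    def __init__(self):
        self.versions = {}  # (key, version_id) -> body
        self.latest = {}    # key -> newest version_id
        self._ids = ("v%d" % i for i in itertools.count(1))

    def put(self, key, body):
        vid = next(self._ids)
        self.versions[(key, vid)] = body
        self.latest[key] = vid
        return vid  # the versionId a writer would record in the MetadataStore

    def get(self, key, version_id=None):
        vid = version_id or self.latest[key]
        try:
            return self.versions[(key, vid)]
        except KeyError:
            # S3 answers 404 for a missing versionId
            raise FileNotFoundError("404: %s?versionId=%s" % (key, vid))

s3 = VersionedMockS3()
meta = {}                                    # stand-in for the S3Guard MetadataStore
meta["foo.txt"] = s3.put("foo.txt", b"abc")
meta["foo.txt"] = s3.put("foo.txt", b"def")  # overwrite also updates the metadata
# Reading by the recorded versionId always yields the overwriting content:
assert s3.get("foo.txt", version_id=meta["foo.txt"]) == b"def"
```

This is the "either the version we asked for or a 404" property: a pinned read can never silently return the pre-overwrite body.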


[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-01-31 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757837#comment-16757837
 ] 

Aaron Fabbri commented on HADOOP-16085:
---

Hi guys. We've thought about this issue a little in the past. You are right 
that S3Guard mostly focuses on metadata consistency. There is some degree of 
data consistency added (e.g. it stops you from reading deleted files or from 
missing recently created ones), but we don't store etags or object versions 
today.

Working on a patch would be a good learning experience for the codebase, which 
I encourage. Also feel free to send S3Guard questions our way (even better, 
ask on the email list and cc: us so others can learn as well). The 
implementation would need to consider some things (off the top of my head) 
below. Not necessary for an RFC patch, but I hope it helps with the concepts.
 - Should be zero extra round trips when turned off (expense in $ and 
performance).
 - Would want to figure out where we'd need additional round trips and decide 
if it is worth it. Tests that assert a certain number of S3 ops will need to 
be made aware, and documentation should outline the marginal cost of the 
feature.
 - What is the conflict resolution policy and how is it configured? If we get 
an unexpected etag/version on read, what do we do? (e.g. retry policy then give 
up, or retry then serve non-matching data. In latter case, do we update the 
S3Guard MetadataStore with the etag/version we ended up getting from S3?)
 - The racing writer issue. IIRC two writers racing to write the same object 
(path) in S3 cannot tell which of them will actually have their version 
materialized, unless versioning is turned on. This means if we supported this 
feature without versioning (just etags) it would be prone to the same sort of 
concurrent modification races that S3 has today. We at least need to document 
the behavior.
 - Backward / forward compatible with existing S3Guarded buckets and Dynamo 
tables.
 - Understand and document any interactions with MetadataStore expiry (related 
jira). In general, data can be expired or purged from the MetadataStore and the 
only negative consequence should be falling back to raw-S3 like consistency 
temporarily. This allows demand-loading the MetadataStore and implementing 
caching with the same APIs.
 - Another semi-related Jira to check out 
[here|https://issues.apache.org/jira/browse/HADOOP-15779].
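One possible shape for the conflict-resolution question above (a hypothetical policy sketch, not a committed design): retry a bounded number of times on an etag/version mismatch, then either give up or serve the non-matching data and let the caller refresh the MetadataStore:

```python
import time

def read_with_policy(get_fn, expected_etag, retries=2, backoff_s=0.0,
                     serve_mismatch=False):
    """Hypothetical conflict-resolution policy: get_fn() returns
    (body, etag); retry up to `retries` times on a mismatch, then either
    raise or serve the non-matching data (the caller may then write the
    observed etag back to the MetadataStore)."""
    for attempt in range(retries + 1):
        body, etag = get_fn()
        if etag == expected_etag:
            return body, etag
        if attempt < retries:
            time.sleep(backoff_s)  # simple fixed backoff for the sketch
    if serve_mismatch:
        return body, etag
    raise IOError("ETag mismatch after %d retries: expected %s, got %s"
                  % (retries, expected_etag, etag))
```

The two end states correspond to the two policies named above: fail fast (raise) or serve stale data and reconcile the store.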




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-01-30 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756508#comment-16756508
 ] 

Ben Roling commented on HADOOP-16085:
-

Thanks for the feedback.  I've gone ahead and started drafting a patch.  I'd 
appreciate any review you can provide when it is ready.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-01-30 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756448#comment-16756448
 ] 

Gabor Bota commented on HADOOP-16085:
-

What you've described is a different issue than the one I answered in my 
earlier comment. It could be a new feature in S3Guard, and as I can see, 
[~ste...@apache.org] has already moved this to the uber JIRA.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-01-29 Thread Ben Roling (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755292#comment-16755292
 ] 

Ben Roling commented on HADOOP-16085:
-

[~gabor.bota] thanks for the response.  You are hitting on a different issue 
than I was trying to get at.  The problem I am referencing (assuming I 
understand the system well enough myself) is present even if all parties are 
always using S3Guard with the same config.

Consider a process that writes s3a://my-bucket/foo.txt with content "abc".  
Another process comes along later and overwrites s3a://my-bucket/foo.txt with 
"def".  Finally, a reader process reads s3a://my-bucket/foo.txt.  S3Guard 
ensures the reader knows that s3a://my-bucket/foo.txt exists and that the 
reader sees something, but it does nothing to ensure the reader sees "def" 
rather than "abc".

That was a contrived example.  One place I see this as likely to occur is 
during failure and retry scenarios of multi-stage ETL pipelines.  Stage 1 runs, 
writing to s3://my-bucket/output, but fails after writing only some of the 
output files that were supposed to go into that directory.  The stage is re-run 
with the same output directory in an overwrite mode such that the original 
output is deleted and the job is rerun with the same s3://my-bucket/output 
target directory.  This time the run is successful, so the ETL continues to 
Stage 2, passing s3://my-bucket/output as the input.  When Stage 2 runs, 
S3Guard ensures it sees the correct listing of files within 
s3://my-bucket/output, but does nothing to ensure it reads the correct version 
of each of these files if the output happened to vary in any way between the 
first and second execution of Stage 1.

One way to avoid this is to suggest that a re-run of a pipeline stage should 
always use a new output directory, but that is not always practical.




[jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite

2019-01-29 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755248#comment-16755248
 ] 

Gabor Bota commented on HADOOP-16085:
-

Hi [~ben.roling], 

I think I know the problem you are talking about. We know about this issue, 
but we are trying to solve it from another end.

In a nutshell: when using S3 with S3Guard, ALL of the clients using the 
S3Guarded bucket should use the same Dynamo table. If you don't use S3Guard, 
or you use the S3 bucket with another Dynamo table, and you make modifications 
rather than just reading, that's an "out of band operation". We don't support 
this now.

We created HADOOP-15999 for this, and I'm currently working on a fix.
