ben-roling commented on issue #794: HADOOP-16085: use object version or etags 
to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/794#issuecomment-490537876
 
 
   I've pushed a commit that adds retries as discussed in 
https://github.com/apache/hadoop/pull/675#issuecomment-488614814
   
   The retries happen in S3AInputStream if the version doesn't match on initial 
open.  There are no retries if the version doesn't match on re-open (during 
seek() backwards).
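
   As a rough illustration of the behavior described above (not the code in the commit), the difference between initial open and re-open could be sketched as follows. `VersionedStore`, `VersionMismatchException`, `VersionCheckedOpen`, and the attempt-count and delay parameters are all hypothetical names invented for this example; they are not S3A or AWS SDK types, and how the real patch wires retries into S3AInputStream is not shown here.

```java
import java.io.IOException;

/** Hypothetical stand-in for an object store client; NOT an S3A or AWS SDK type. */
interface VersionedStore {
    /** Returns the version ID (or eTag) the store currently serves for the key. */
    String headVersion(String key) throws IOException;
}

/** Hypothetical exception: the served version does not match the expected one. */
class VersionMismatchException extends IOException {
    VersionMismatchException(String message) {
        super(message);
    }
}

class VersionCheckedOpen {

    /**
     * Initial open: keep retrying while the store serves a version other than
     * the expected one, up to maxAttempts (an assumed, illustrative limit).
     */
    static String openWithRetry(VersionedStore store, String key, String expected,
                                int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        String seen = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            seen = store.headVersion(key);
            if (expected.equals(seen)) {
                return seen; // consistent read: safe to start streaming
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis); // give the store time to converge
            }
        }
        throw new VersionMismatchException("Expected version " + expected
                + " for " + key + " but saw " + seen
                + " after " + maxAttempts + " attempts");
    }

    /** Re-open (for example after a backwards seek): a mismatch fails at once. */
    static String reopen(VersionedStore store, String key, String expected)
            throws IOException {
        String seen = store.headVersion(key);
        if (!expected.equals(seen)) {
            throw new VersionMismatchException("Expected version " + expected
                    + " for " + key + " but saw " + seen + " on re-open");
        }
        return seen;
    }
}
```

   A store that serves a stale version for the first two calls and the expected one on the third would succeed in `openWithRetry` with five attempts allowed, while the same stale response during `reopen` would fail immediately.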
   
   Retries also happen for rename() and select().
   
   Testing was added in ITestS3ARemoteFileChanged.  I used Mockito.spy() on the 
S3 client to stub in inconsistent responses until a retry threshold is reached.
   
   I've run the full test suite (against a bucket with versioning enabled in 
us-west-2):
   
   ```
   mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=8 -Ds3guard -Ddynamo
   ```
   
   ```
   [ERROR] Tests run: 896, Failures: 0, Errors: 2, Skipped: 145
   ```
   
   The two errors were in ITestDirectoryCommitMRJob and 
ITestS3GuardConcurrentOps, both of which succeeded when run individually:
   
   ```
   mvn -T 1C verify -Dtest=skip -Dit.test=ITestDirectoryCommitMRJob -Ds3guard -Ddynamo
   mvn -T 1C verify -Dtest=skip -Dit.test=ITestS3GuardConcurrentOps -Ds3guard -Ddynamo
   ```
   
   https://github.com/apache/hadoop/pull/675#issuecomment-488614814 suggests 
possibly using different retry settings for these scenarios.  I haven't done 
that yet; perhaps it can be carved off as a separate issue.  Similarly, I 
haven't implemented the HADOOP-13293 proposal.  I'm open to both, but would 
like to get the rest of this settled (merged) first if possible.
