Steve Loughran created HADOOP-16412:
---------------------------------------

             Summary: S3a getFileStatus to update DDB if an S3 query returns 
etag/versionID
                 Key: HADOOP-16412
                 URL: https://issues.apache.org/jira/browse/HADOOP-16412
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.3.0
            Reporter: Steve Loughran


Now that S3Guard tables support etags and version IDs, we should do more to 
populate them.

# listStatus/listFiles don't give us all the information; the AWS v1 and v2 
list operations only return the etags, not the version IDs
# a treewalk on import with a HEAD on each object would be expensive and slow

What we can do is, on a getFileStatus call, update the version markers of any 
S3Guard table entry where

* the etag is already in the S3Guard entry
* the probe of the store returns an entry with the same etag and a version ID

In that situation we know the S3 data and S3Guard data are consistent, so 
updating the version ID fills out the data. 
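
A rough sketch of that check, not the actual S3A/S3Guard code: ObjectAttrs, VersionIdBackfill and backfill() are hypothetical names invented here for illustration. It also assumes that an entry which already has a version ID is left untouched, which the proposal above does not spell out.

```java
import java.util.Optional;

/** Hypothetical helper, for illustration only; not part of the S3A codebase. */
final class VersionIdBackfill {

  /** Minimal stand-in for the etag/version ID pair of an entry; either may be null. */
  static final class ObjectAttrs {
    final String etag;
    final String versionId;

    ObjectAttrs(String etag, String versionId) {
      this.etag = etag;
      this.versionId = versionId;
    }
  }

  private VersionIdBackfill() {
  }

  /**
   * Decide whether the S3Guard entry can safely take the version ID
   * returned by the probe of the store.
   *
   * @param guardEntry attributes currently held in the S3Guard table
   * @param storeEntry attributes returned by the probe of S3
   * @return the updated attributes to write back, or empty if nothing should change
   */
  static Optional<ObjectAttrs> backfill(ObjectAttrs guardEntry, ObjectAttrs storeEntry) {
    if (guardEntry == null || storeEntry == null) {
      return Optional.empty();
    }
    // Condition 1: the etag is already in the S3Guard entry.
    if (guardEntry.etag == null) {
      return Optional.empty();
    }
    // Condition 2: the store returned the same etag and a version ID.
    if (!guardEntry.etag.equals(storeEntry.etag) || storeEntry.versionId == null) {
      return Optional.empty();
    }
    // Assumption: only fill in a missing version ID; never overwrite one.
    if (guardEntry.versionId != null) {
      return Optional.empty();
    }
    return Optional.of(new ObjectAttrs(guardEntry.etag, storeEntry.versionId));
  }
}
```

A caller in the getFileStatus path would write the returned attributes back to the table only when the Optional is non-empty; every other case leaves the S3Guard entry as it was.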

We could also think about updating etags in entries created by older versions 
of S3Guard; it'd be a bit trickier there to decide whether the S3 store entry 
was current. Probably safest to leave those alone.



