Steve Loughran created HADOOP-16412:
---------------------------------------
Summary: S3a getFileStatus to update DDB if an S3 query returns
etag/versionID
Key: HADOOP-16412
URL: https://issues.apache.org/jira/browse/HADOOP-16412
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.3.0
Reporter: Steve Loughran
Now that S3Guard tables support etags and version IDs, we should do more to
populate those fields.
# listStatus/listFiles don't give us all the information: the AWS v1 and v2
list operations return only the etags, not the version IDs
# a treewalk on import, with a HEAD on each object, would be expensive and slow
What we can do is, on a getFileStatus call, add the version ID to any
S3Guard table entry where
* the etag is already in the S3Guard entry
* the probe of the store returns an entry with the same etag and a version ID
In that situation we know the S3 data and S3Guard data are consistent, so
adding the version ID safely completes the S3Guard entry.
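
A minimal sketch of that rule, to make the conditions concrete. Everything
here is hypothetical: VersionIdBackfill, Entry and backfill() are illustrative
names, not existing S3A or S3Guard classes. It also assumes an update is only
wanted when the S3Guard entry has no version ID yet.

```java
import java.util.Optional;

/** Hypothetical sketch; these are not real S3A/S3Guard types. */
final class VersionIdBackfill {

  /** Stand-in for either an S3Guard table entry or the result of an S3 HEAD. */
  static final class Entry {
    final String path;
    final String etag;       // null if unknown
    final String versionId;  // null if unknown

    Entry(String path, String etag, String versionId) {
      this.path = path;
      this.etag = etag;
      this.versionId = versionId;
    }
  }

  private VersionIdBackfill() {
  }

  /**
   * Decide whether the S3Guard entry should gain the probed version ID.
   *
   * @param stored entry currently held in the S3Guard table
   * @param probed entry returned by the probe of the store
   * @return the updated entry to write back, or empty if nothing should change
   */
  static Optional<Entry> backfill(Entry stored, Entry probed) {
    if (stored == null || probed == null) {
      return Optional.empty();
    }
    // Condition 1: the etag is already in the S3Guard entry.
    boolean etagKnown = stored.etag != null;
    // Condition 2: the probe returned the same etag and a version ID.
    boolean sameEtag = etagKnown && stored.etag.equals(probed.etag);
    boolean probeHasVersion = probed.versionId != null;
    // Assumption (not in the issue text): only fill in a missing version ID,
    // never overwrite one that is already recorded.
    boolean storedLacksVersion = stored.versionId == null;

    if (sameEtag && probeHasVersion && storedLacksVersion) {
      return Optional.of(new Entry(stored.path, stored.etag, probed.versionId));
    }
    return Optional.empty();
  }
}
```

An entry with no etag, or whose etag differs from the probe result, is left
untouched, which matches the cautious stance on older entries.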
We could also think about updating etags in entries created by older versions
of S3Guard; it'd be a bit trickier there to decide if the S3 store entry was
current. Probably safest to leave those alone.