Aaron Fabbri created HADOOP-15400:
-------------------------------------

             Summary: Improve S3Guard documentation on Authoritative Mode 
implementation
                 Key: HADOOP-15400
                 URL: https://issues.apache.org/jira/browse/HADOOP-15400
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs/s3
    Affects Versions: 3.0.1
            Reporter: Aaron Fabbri


Part of the design of S3Guard is support for skipping the call to S3 
listObjects and serving directory listings out of the MetadataStore under 
certain circumstances.  This feature is called "authoritative" mode.  I've 
talked to many people about this feature and it seems to be universally 
confusing.

I suggest we add a section to the s3guard.md site docs (or improve the existing 
text) explaining what Authoritative Mode is.

It is *not* treating the MetadataStore (e.g. dynamodb) as the source of truth 
in general.

It *is* the ability to short-circuit S3 list objects and serve listings from 
the MetadataStore in some circumstances: 

For S3A to skip S3's list objects on some *path*, and serve it directly from 
the MetadataStore, the following things must all be true:
 # The MetadataStore implementation persists the 
{{DirListingMetadata.isAuthoritative}} bit when it is set in a call to 
{{MetadataStore#put(DirListingMetadata)}}.
 # The S3A client is configured to allow the MetadataStore to be the 
authoritative source of a directory listing 
({{fs.s3a.metadatastore.authoritative=true}}).
 # The MetadataStore has a full listing for *path* stored in it.  This only 
happens if the FS client (S3A) has explicitly stored a full directory listing 
with {{DirListingMetadata.isAuthoritative=true}} before the listing request 
happens.

Note that #1 currently happens only in LocalMetadataStore. Adding support to 
DynamoDBMetadataStore is covered in HADOOP-14154.
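
The three conditions above can be sketched as a single predicate. This is only an illustration, not the actual S3A code: {{CachedListing}}, {{AuthoritativeSketch}} and {{canSkipS3List}} are hypothetical names, and the real check is spread across S3AFileSystem and the MetadataStore implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a cached directory listing; not the real Hadoop class. */
final class CachedListing {
    final List<String> children;
    final boolean fullListing; // the bit this issue calls isAuthoritative

    CachedListing(List<String> children, boolean fullListing) {
        this.children = children;
        this.fullListing = fullListing;
    }
}

/** Hypothetical sketch of the "may S3A skip the S3 list call?" decision. */
final class AuthoritativeSketch {

    /**
     * @param storePersistsBit          condition 1: the store keeps the bit across put/get
     * @param clientAllowsAuthoritative condition 2: fs.s3a.metadatastore.authoritative=true
     * @param cached                    condition 3: listing found in the store, or null
     */
    static boolean canSkipS3List(boolean storePersistsBit,
                                 boolean clientAllowsAuthoritative,
                                 CachedListing cached) {
        return storePersistsBit
                && clientAllowsAuthoritative
                && cached != null
                && cached.fullListing;
    }

    public static void main(String[] args) {
        CachedListing full = new CachedListing(Arrays.asList("a", "b"), true);
        CachedListing partial = new CachedListing(Arrays.asList("a"), false);

        // All three conditions hold: the listing may be served from the store.
        System.out.println(canSkipS3List(true, true, full));
        // Client flag is off: S3 must still be listed.
        System.out.println(canSkipS3List(true, false, full));
        // Listing is not known to be complete: S3 must still be listed.
        System.out.println(canSkipS3List(true, true, partial));
    }
}
```

If any one of the inputs is false (or no listing is cached), the sketch falls back to listing S3, which matches the point that the MetadataStore is not the source of truth in general.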

Also, the multiple uses of the word "authoritative" are confusing. Two meanings 
are used:

 1. In the FS client configuration fs.s3a.metadatastore.authoritative
    - Behavior of S3A code (not MetadataStore)
    - "S3A is allowed to skip S3.list() when it has a full listing from 
      MetadataStore"

 2. In the MetadataStore
    - When storing a dir listing, the client can set a bit, isAuthoritative
    - 1 : "full contents of directory"
    - 0 : "may not be full listing"

Note that a MetadataStore *MAY* persist this bit (not *MUST*).
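
To make the two meanings concrete, here is a minimal sketch that keeps them apart: a client-side flag and a per-listing bit that a store may or may not persist. All types here ({{SketchStore}}, {{BitPersistingStore}}, {{BitDroppingStore}}, {{TwoMeaningsSketch}}) are hypothetical and are not the real MetadataStore interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, reduced view of a metadata store: only the listing bit. */
interface SketchStore {
    void putListing(String path, boolean isAuthoritative);
    boolean isAuthoritative(String path);
}

/** A store that chooses to persist the bit. */
final class BitPersistingStore implements SketchStore {
    private final Map<String, Boolean> bits = new HashMap<>();

    public void putListing(String path, boolean isAuthoritative) {
        bits.put(path, isAuthoritative);
    }

    public boolean isAuthoritative(String path) {
        return bits.getOrDefault(path, false);
    }
}

/** A store that drops the bit, which is allowed because persisting is MAY, not MUST. */
final class BitDroppingStore implements SketchStore {
    public void putListing(String path, boolean isAuthoritative) {
        // The bit is intentionally ignored.
    }

    public boolean isAuthoritative(String path) {
        return false; // always "may not be full listing"
    }
}

final class TwoMeaningsSketch {
    /** Meaning 1 (client flag) combined with meaning 2 (stored bit). */
    static boolean mayServeFromStore(boolean clientFlag, SketchStore store, String path) {
        return clientFlag && store.isAuthoritative(path);
    }

    public static void main(String[] args) {
        SketchStore keeps = new BitPersistingStore();
        SketchStore drops = new BitDroppingStore();
        keeps.putListing("/dir", true);
        drops.putListing("/dir", true);

        System.out.println(mayServeFromStore(true, keeps, "/dir"));
        System.out.println(mayServeFromStore(true, drops, "/dir"));
        System.out.println(mayServeFromStore(false, keeps, "/dir"));
    }
}
```

With a store that drops the bit, the client flag alone never causes S3A to skip the S3 list call; this is the sketch's reading of the MAY versus MUST distinction.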

We should probably rename {{DirListingMetadata.isAuthoritative}} to 
{{.fullListing}}, or at least add a comment where it is used to clarify its 
meaning.


