[ 
https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-13449:
-----------------------------------
    Attachment: HADOOP-13449-HADOOP-13345.005.patch

Thanks for the discussion, [~fabbri]. That's very helpful.

{quote}
for v1, you could always return authoritative = false. 
{quote}
Yes, that is what the current patch does. Let's address this in a follow-up 
JIRA after both [HADOOP-13651] and this issue are committed.
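
To make the v1 behaviour concrete, here is a minimal sketch. The class and 
method names ({{ListingSketch}}, {{fromStoreV1}}, {{resolve}}) are hypothetical 
and are not the classes in the patch. The assumption is that a listing carries 
an authoritative flag, and that a caller must merge in the S3 listing whenever 
the flag is false.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical listing result; not the class used in the patch. */
class ListingSketch {
  final List<String> children;
  final boolean authoritative;

  ListingSketch(List<String> children, boolean authoritative) {
    this.children = children;
    this.authoritative = authoritative;
  }

  /** v1 behaviour: the store never claims that its listing is complete. */
  static ListingSketch fromStoreV1(List<String> storeChildren) {
    return new ListingSketch(new ArrayList<>(storeChildren), false);
  }

  /** Unless the store listing is authoritative, merge in what S3 reports. */
  static Set<String> resolve(ListingSketch fromStore, List<String> fromS3) {
    Set<String> merged = new TreeSet<>(fromStore.children);
    if (!fromStore.authoritative) {
      merged.addAll(fromS3);
    }
    return merged;
  }
}
```

With the flag always false, every listing still costs an S3 call, which is why 
returning true where possible is left to the follow-up.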

{quote}
The interface allows any of these behaviors.... The filesystem is responsible 
for ensuring that the delete to /a must be recursive since it is not empty. 
MetadataStore explicitly does not do that.
{quote}
Agreed. For example, {{delete(path)}} does not check whether the directory at 
that path is empty.
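
The split of responsibilities could be sketched as follows. This is an 
in-memory illustration only: {{StoreSketch}} and {{FsSketch}} are hypothetical 
stand-ins, not the MetadataStore or S3AFileSystem code, and the unchecked 
exception stands in for the IOException a real filesystem would raise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a metadata store. */
class StoreSketch {
  // Maps path -> isDirectory.
  final TreeMap<String, Boolean> entries = new TreeMap<>();

  void put(String path, boolean isDir) {
    entries.put(path, isDir);
  }

  /** Removes exactly one entry; deliberately performs no emptiness check. */
  void delete(String path) {
    entries.remove(path);
  }

  /** Returns every stored path strictly below the given directory. */
  List<String> descendants(String dir) {
    List<String> result = new ArrayList<>();
    String prefix = dir + "/";
    for (String p : entries.keySet()) {
      if (p.startsWith(prefix)) {
        result.add(p);
      }
    }
    return result;
  }
}

/** Hypothetical filesystem layer that enforces the delete contract. */
class FsSketch {
  final StoreSketch store = new StoreSketch();

  boolean delete(String path, boolean recursive) {
    List<String> below = store.descendants(path);
    if (!below.isEmpty() && !recursive) {
      // A real filesystem would raise an IOException here.
      throw new IllegalStateException("Directory not empty: " + path);
    }
    for (String p : below) {
      store.delete(p);
    }
    boolean existed = store.entries.containsKey(path);
    store.delete(path);
    return existed;
  }
}
```

Calling {{store.delete("/a")}} directly while {{/a/b}} exists would leave an 
orphaned child entry, so only the filesystem layer should decide when a 
directory may be removed.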

{quote}
You either have to (A) pay money to store an extra copy of your metadata 
forever, or (B) spend money and time hydrating the MetadataStore each time you 
start a cluster.
{quote}
The metadata size is considered small, and the price of DDB storage is low 
compared with the pricing of read/write operations. If I have to choose, (A) 
makes more sense.

{quote}
and we don't assume everything is always in DynamoDB, it makes recovery much 
easier
{quote}
That's a very valid point. Updating S3 and the MetadataStore is not atomic.

{quote}
The other concern is that I just don't understand why you would want to do the 
preloading.
{quote}
Do you mean the import? I suppose not. For reading and writing existing S3 
buckets, importing the structure first seems to be a prerequisite, unless we 
assume the store discovers/converges quickly or we accept little consistency.
I guess you mean the constraints on pre-creating parent directories. I 
re-read the design doc and the [HADOOP-13651] patch, and I think you made a 
good point about this. Let S3AFileSystem ensure the contract.

Moreover, I now think storing the is_empty bit in DynamoDB is not ideal. 
Maintaining it takes non-trivial effort and it is easy to get wrong. Perhaps 
we can query with the parent directory as the HASH key when we need this 
information. That is not trivial either; I'll think about it as my next piece 
of work. We can either fix this in the next patch, or I'll work on it in a 
follow-up JIRA.
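
A rough sketch of the query-based alternative, modelled in memory. It assumes 
a table keyed by parent directory (HASH key) and child name (RANGE key); the 
class {{ParentKeyedTableSketch}} and its methods are hypothetical and do not 
call the DynamoDB SDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical in-memory model of a table keyed by (parent, child). */
class ParentKeyedTableSketch {
  // HASH key: parent directory path. RANGE key: child name.
  private final Map<String, SortedSet<String>> items = new HashMap<>();

  void put(String parent, String child) {
    items.computeIfAbsent(parent, k -> new TreeSet<>()).add(child);
  }

  void delete(String parent, String child) {
    SortedSet<String> children = items.get(parent);
    if (children != null) {
      children.remove(child);
      if (children.isEmpty()) {
        items.remove(parent);
      }
    }
  }

  /** Stand-in for a query on the HASH key. */
  SortedSet<String> queryByParent(String parent) {
    return items.getOrDefault(parent, Collections.emptySortedSet());
  }

  /** Emptiness is derived at read time, so no stored bit has to be kept in sync. */
  boolean isEmptyDirectory(String dir) {
    return queryByParent(dir).isEmpty();
  }
}
```

The trade-off is one extra query per emptiness check. Against a real table the 
query could presumably be limited to a single item, since only the existence 
of a child matters.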

If this patch is still in question, a conference call will be very helpful. 
Let's schedule next week. [~ste...@apache.org] is traveling this week.

[~eddyxu], do you have any more comments now that I have revised the patch?

Thank you,

> S3Guard: Implement DynamoDBMetadataStore.
> -----------------------------------------
>
>                 Key: HADOOP-13449
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13449
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13449-HADOOP-13345.000.patch, 
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch, 
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch, 
> HADOOP-13449-HADOOP-13345.005.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
