[ 
https://issues.apache.org/jira/browse/HADOOP-16433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888976#comment-16888976
 ] 

Gabor Bota commented on HADOOP-16433:
-------------------------------------

Fixing this is not as straightforward as it first seemed, because the 
lastUpdated field was somehow 0 in LocalMetadataStore.
The reason turned out to be that we use DirListingMetadata differently there 
than in other places: we store it in the cache rather than building and 
returning it, as we do in the DynamoDB store. The fix was to accept 
PathMetadata in DirListingMetadata's put method. In the end we store 
PathMetadata inside the DirListingMetadata anyway, so (1) we should not hide 
that behavior from callers, and (2) we lose data if we continue to pass 
FileStatus instead of PathMetadata.
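
To illustrate the data loss, here is a minimal sketch using simplified 
stand-in classes. SimpleFileStatus, SimplePathMetadata and SimpleDirListing are 
hypothetical names, not the real Hadoop types, and the assumption that wrapping 
a bare status leaves lastUpdated at 0 is only my reading of the behavior 
described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for FileStatus, PathMetadata and
// DirListingMetadata. They only model the lastUpdated handling.
final class SimpleFileStatus {
  final String path;

  SimpleFileStatus(String path) {
    this.path = path;
  }
}

final class SimplePathMetadata {
  final SimpleFileStatus status;
  final long lastUpdated; // 0 means "not filled in"

  SimplePathMetadata(SimpleFileStatus status, long lastUpdated) {
    this.status = status;
    this.lastUpdated = lastUpdated;
  }
}

final class SimpleDirListing {
  private final Map<String, SimplePathMetadata> children = new LinkedHashMap<>();

  // Old shape: the listing wraps the status itself, so any timestamp the
  // caller knew about is lost and lastUpdated stays at 0 (assumed default).
  void putStatus(SimpleFileStatus status) {
    children.put(status.path, new SimplePathMetadata(status, 0L));
  }

  // New shape: the caller hands over the metadata object that is stored
  // anyway, so lastUpdated survives when the listing is kept in a cache.
  void putMetadata(SimplePathMetadata meta) {
    children.put(meta.status.path, meta);
  }

  SimplePathMetadata get(String path) {
    return children.get(path);
  }
}
```

With the status-based put, a store that caches the listing object itself has 
no way to get the timestamp back later, which would match the lastUpdated == 0 
symptom.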

> S3Guard: Filter expired entries and tombstones when listing with 
> MetadataStore#listChildren
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16433
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Gabor Bota
>            Assignee: Gabor Bota
>            Priority: Blocker
>
> Currently, we don't filter out expired entries in {{listChildren}} implementations.
> This can cause bugs and inconsistencies, so it should be fixed.
> It can lead to a state that we can't recover from, as in the following:
> {{guarded and raw (OOB op) clients are doing ops to S3}}
> {noformat}
> Guarded: touch /AAAA
> Guarded: touch /ZZZZ
> Guarded: rm /AAAA {{-> tombstone in MS}}
> RAW: touch /AAAA/file.ext {{-> file is hidden with a tombstone}}
> Guarded: ls / {{-> only ZZZZ will show up in the listing. }}
> {noformat}
> After we change the following code
> {code:java}
>           final List<PathMetadata> metas = new ArrayList<>();
>           for (Item item : items) {
>             DDBPathMetadata meta = itemToPathMetadata(item, username);
>             metas.add(meta);
>           }
> {code}
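> to
> {code:java}
>           final List<PathMetadata> metas = new ArrayList<>();
>           for (Item item : items) {
>             DDBPathMetadata meta = itemToPathMetadata(item, username);
>             // handle expiry - only add not expired entries to listing.
>             if (meta.getLastUpdated() == 0 ||
>                 !meta.isExpired(ttlTimeProvider.getMetadataTtl(),
>                     ttlTimeProvider.getNow())) {
>               metas.add(meta);
>             }
>           }
> {code}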
> we will filter out expired entries from the listing, so we can recover from 
> this kind of OOB op.
> Note: we have to handle the lastUpdated == 0 case, where the lastUpdated 
> field is not filled in!
> Note: this can only be fixed cleanly after HADOOP-16383 is fixed, because we 
> need to have the ttlTimeProvider in MS to handle this internally.
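
The filter condition from the description can be tried out in isolation with 
the sketch below. ListingEntry and ExpiryFilter are hypothetical names, and the 
expiry rule (lastUpdated + ttl <= now) is an assumption about how isExpired 
behaves, not a copy of the Hadoop implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a listing entry that carries a lastUpdated time.
final class ListingEntry {
  final String path;
  final long lastUpdated; // 0 means the field was never filled in

  ListingEntry(String path, long lastUpdated) {
    this.path = path;
    this.lastUpdated = lastUpdated;
  }

  // Assumed expiry rule: expired once lastUpdated + ttl is not after now.
  boolean isExpired(long ttl, long now) {
    return lastUpdated + ttl <= now;
  }
}

final class ExpiryFilter {
  // Keeps entries that are not expired, plus entries whose lastUpdated is 0,
  // because an unset timestamp says nothing about the age of the entry.
  static List<ListingEntry> filterExpired(
      List<ListingEntry> items, long ttl, long now) {
    final List<ListingEntry> kept = new ArrayList<>();
    for (ListingEntry entry : items) {
      if (entry.lastUpdated == 0 || !entry.isExpired(ttl, now)) {
        kept.add(entry);
      }
    }
    return kept;
  }
}
```

Without the lastUpdated == 0 check, every entry with an unset timestamp would 
count as expired for any realistic value of now and would drop out of the 
listing.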



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

