[ https://issues.apache.org/jira/browse/HADOOP-16433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Bota updated HADOOP-16433:
--------------------------------
    Description:
Currently, we don't filter out entries in {{listChildren}} implementations.
This can cause bugs and inconsistencies, so this should be fixed.
It can lead to a state where we can't recover from the following:

{{guarded and raw (OOB op) clients are doing ops to S3}}
{noformat}
Guarded: touch /AAAA
Guarded: touch /ZZZZ
Guarded: rm /AAAA {{-> tombstone in MS}}
RAW: touch /AAAA/file.ext {{-> file is hidden with a tombstone}}
Guarded: ls / {{-> the directory is empty}}
{noformat}

After we change the following code
{code:java}
final List<PathMetadata> metas = new ArrayList<>();
for (Item item : items) {
  DDBPathMetadata meta = itemToPathMetadata(item, username);
  metas.add(meta);
}
{code}
to
{code:java}
// handle expiry - only add not expired entries to listing.
if (meta.getLastUpdated() == 0 ||
    !meta.isExpired(ttlTimeProvider.getMetadataTtl(),
    ttlTimeProvider.getNow())) {
  metas.add(meta);
}
{code}
we will filter out expired entries from the listing, so we can recover from these kinds of OOB ops.

Note: we have to handle the lastUpdated == 0 case, where the lastUpdated field is not filled in!

Note: this can only be fixed cleanly after HADOOP-16383 is fixed because we need to have the TTLtimeProvider in MS to handle this internally.

  was:
Currently, we don't filter out entries in {{listChildren}} implementations.
This can cause bugs and inconsistencies, so this should be fixed.
It can lead to a state where we can't recover from the following:

{{guarded and raw (OOB op) clients are doing ops to S3}}
{noformat}
Guarded: touch /AAAA
Guarded: touch /ZZZZ
Guarded: rm /AAAA {{-> tombstone in MS}}
RAW: touch /AAAA/file.ext {{-> file is hidden with a tombstone}}
Guarded: ls / {{-> the directory is empty}}
{noformat}

After we change the following code
{code:java}
final List<PathMetadata> metas = new ArrayList<>();
for (Item item : items) {
  DDBPathMetadata meta = itemToPathMetadata(item, username);
  metas.add(meta);
}
{code}
to
{code:java}
// handle expiry - only add not expired entries to listing.
if (meta.getLastUpdated() == 0 ||
    !meta.isExpired(ttlTimeProvider.getMetadataTtl(),
    ttlTimeProvider.getNow())) {
  metas.add(meta);
}
{code}
we will filter out expired entries from the listing, so we can recover from these kinds of OOB ops.


> Filter expired entries and tombstones when listing with
> MetadataStore#listChildren
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-16433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16433
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Gabor Bota
>            Assignee: Gabor Bota
>            Priority: Major
>
> Currently, we don't filter out entries in {{listChildren}} implementations.
> This can cause bugs and inconsistencies, so this should be fixed.
> It can lead to a state where we can't recover from the following:
> {{guarded and raw (OOB op) clients are doing ops to S3}}
> {noformat}
> Guarded: touch /AAAA
> Guarded: touch /ZZZZ
> Guarded: rm /AAAA {{-> tombstone in MS}}
> RAW: touch /AAAA/file.ext {{-> file is hidden with a tombstone}}
> Guarded: ls / {{-> the directory is empty}}
> {noformat}
> After we change the following code
> {code:java}
> final List<PathMetadata> metas = new ArrayList<>();
> for (Item item : items) {
>   DDBPathMetadata meta = itemToPathMetadata(item, username);
>   metas.add(meta);
> }
> {code}
> to
> {code:java}
> // handle expiry - only add not expired entries to listing.
> if (meta.getLastUpdated() == 0 ||
>     !meta.isExpired(ttlTimeProvider.getMetadataTtl(),
>     ttlTimeProvider.getNow())) {
>   metas.add(meta);
> }
> {code}
> we will filter out expired entries from the listing, so we can recover from
> these kinds of OOB ops.
> Note: we have to handle the lastUpdated == 0 case, where the lastUpdated
> field is not filled in!
> Note: this can only be fixed cleanly after HADOOP-16383 is fixed because we
> need to have the TTLtimeProvider in MS to handle this internally.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
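
The filtering described in the issue can be tried out in isolation with a small, self-contained sketch. Everything below is a hypothetical stand-in, not the S3Guard code: {{Entry}} replaces {{DDBPathMetadata}}, the {{ttl}} and {{now}} arguments replace the values read from {{ttlTimeProvider}}, and the expiry rule ({{lastUpdated + ttl <= now}}) is an assumption made for illustration. Only the shape of the {{if}} condition is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

public class ListChildrenFilterSketch {

  /** Hypothetical stand-in for a metadata store entry (not the real PathMetadata). */
  static final class Entry {
    final String path;
    final long lastUpdated;   // 0 means the field was never filled in
    final boolean tombstone;  // true if the entry marks a deleted path

    Entry(String path, long lastUpdated, boolean tombstone) {
      this.path = path;
      this.lastUpdated = lastUpdated;
      this.tombstone = tombstone;
    }

    /** Assumed expiry rule: an entry expires once its age reaches the TTL. */
    boolean isExpired(long ttl, long now) {
      return lastUpdated + ttl <= now;
    }
  }

  /** Keeps entries that are not expired, plus entries without a lastUpdated value. */
  static List<Entry> filterExpired(List<Entry> items, long ttl, long now) {
    final List<Entry> metas = new ArrayList<>();
    for (Entry meta : items) {
      // handle expiry - only add not expired entries to listing.
      if (meta.lastUpdated == 0 || !meta.isExpired(ttl, now)) {
        metas.add(meta);
      }
    }
    return metas;
  }

  public static void main(String[] args) {
    List<Entry> items = new ArrayList<>();
    items.add(new Entry("/AAAA", 1000L, true));    // old tombstone, expired
    items.add(new Entry("/ZZZZ", 4500L, false));   // recent entry, still valid
    items.add(new Entry("/legacy", 0L, false));    // lastUpdated not filled in

    for (Entry e : filterExpired(items, 1000L, 5000L)) {
      System.out.println(e.path);
    }
  }
}
```

With a TTL of 1000 and a current time of 5000, the expired tombstone for {{/AAAA}} is dropped from the listing, while {{/ZZZZ}} and the entry with {{lastUpdated == 0}} are kept. Once the tombstone no longer appears in the listing, the file created out of band under {{/AAAA}} is no longer hidden by it.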