[ https://issues.apache.org/jira/browse/HADOOP-16801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026567#comment-17026567 ]
Hudson commented on HADOOP-16801:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17920 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17920/])
HADOOP-16801. S3Guard listFiles will not query S3 if all listings are (github: rev 5977360878e6780bd04842c8a2156f9848e1d088)
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestDynamoDBMetadataStoreAuthoritativeMode.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3Guard.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/ImportOperation.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreListFilesIterator.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java


> S3Guard listFiles will not query S3 if all listings are authoritative
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-16801
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16801
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Minor
>         Attachments: HADOOP-aws-no-prefetch.prelim.patch
>
>
> S3Guard does not respect the authoritative metadata store when listFiles is
> used with recursive=true. It queries S3 even when the given directory tree is
> one level deep, with no nested directories, and the parent directory listing
> is authoritative.
> S3Guard should check the listings in the given directory tree for
> authoritativeness and not query S3 when all listings in the tree are marked
> as authoritative in the metadata table (given that the metadata store is
> configured to be authoritative).
>
> Below is a description of how the current code works:
> S3AFileSystem#listFiles with the recursive option queries S3 even when the
> directory listing is authoritative. FileStatusListingIterator is created with
> the given entries from the metadata store
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Listing.java#L126].
> However, FileStatusListingIterator has an ObjectListingIterator that
> prefetches from S3 regardless of whether the listing is authoritative. We
> observed this behavior when using DynamoDBMetadataStore.
> I suppressed the unnecessary S3 calls by providing a dumb listing iterator to
> the listFiles call in the attached patch. Obviously this is not a solution;
> it only demonstrates the source of the problem.
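
For illustration, the check requested in the description (walk the directory tree and skip S3 only when every listing is authoritative) could look roughly like the sketch below. It is a minimal stand-alone model, not the committed change: AuthoritativeTreeCheck, DirListing, put, allListingsAuthoritative and mustQueryS3 are hypothetical names invented for this example and are not hadoop-aws APIs. The in-memory map stands in for the metadata table.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative model only. These types are hypothetical and are NOT the
 * hadoop-aws classes (S3Guard, MetadataStore, DirListingMetadata).
 */
class AuthoritativeTreeCheck {

  /** One cached directory listing: its child directories and an authoritative flag. */
  static final class DirListing {
    final boolean authoritative;
    final List<String> childDirs;

    DirListing(boolean authoritative, List<String> childDirs) {
      this.authoritative = authoritative;
      this.childDirs = childDirs;
    }
  }

  /** Hypothetical in-memory stand-in for the metadata table, keyed by directory path. */
  private final Map<String, DirListing> listings = new HashMap<>();

  /** Record (or overwrite) the cached listing for a directory. */
  void put(String dir, boolean authoritative, String... childDirs) {
    listings.put(dir, new DirListing(authoritative, Arrays.asList(childDirs)));
  }

  /**
   * Returns true only if the listing of root and of every directory below it
   * is present in the store and marked authoritative.
   */
  boolean allListingsAuthoritative(String root) {
    Deque<String> pending = new ArrayDeque<>();
    pending.push(root);
    while (!pending.isEmpty()) {
      DirListing listing = listings.get(pending.pop());
      if (listing == null || !listing.authoritative) {
        return false; // a missing or non-authoritative listing forces an S3 query
      }
      for (String child : listing.childDirs) {
        pending.push(child);
      }
    }
    return true;
  }

  /** A recursive listing may skip S3 only if the store is configured as
   *  authoritative and the whole tree is authoritative. */
  boolean mustQueryS3(String root, boolean storeConfiguredAuthoritative) {
    return !(storeConfiguredAuthoritative && allListingsAuthoritative(root));
  }

  public static void main(String[] args) {
    AuthoritativeTreeCheck store = new AuthoritativeTreeCheck();
    // The case from the report: a one-level tree whose listing is authoritative.
    store.put("/table", true);
    System.out.println("flat tree, must query S3: " + store.mustQueryS3("/table", true));

    // A nested directory whose listing is not authoritative.
    store.put("/table", true, "/table/part=1");
    store.put("/table/part=1", false);
    System.out.println("nested tree, must query S3: " + store.mustQueryS3("/table", true));
  }
}
```

The sketch treats a missing listing the same as a non-authoritative one, so it falls back to S3 whenever the store cannot vouch for part of the tree. That conservative choice is an assumption of this example.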