[ 
https://issues.apache.org/jira/browse/HADOOP-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785576#comment-15785576
 ] 

Steve Loughran commented on HADOOP-13926:
-----------------------------------------

List calls that return iterators should ideally be designed to iterate over 
buckets containing millions of files without worrying about memory or startup 
costs. You can see the performance difference if you try to list the landsat 
bucket. On 2.8 the iterator works; on 2.7 the code blocks for so long that 
tests time out, just because there are too many blobs to list in the treewalk.

We need to make sure this call and related ones, and implicitly S3Guard, 
can handle paths with a few million child entries.
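
To illustrate the property being asked for, here is a minimal sketch of an 
iterator that pulls one page of a listing at a time, so the first entry is 
available after a single fetch and memory stays bounded by the page size. 
This is not the S3A or S3Guard code: PagedListing, PageSource, PagedIterator 
and simulatedBucket are hypothetical names, and the "bucket" is simulated 
in memory rather than read from S3.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Sketch only: lazy, page-at-a-time listing over a simulated store. */
public class PagedListing {

    /** Hypothetical stand-in for a paged listing call; an empty page means "no more entries". */
    interface PageSource {
        List<String> fetchPage(int pageIndex);
    }

    /** Holds at most one page in memory and fetches the next page on demand. */
    static final class PagedIterator implements Iterator<String> {
        private final PageSource source;
        private Iterator<String> current = Collections.emptyIterator();
        private int nextPage = 0;
        private boolean exhausted = false;
        int pagesFetched = 0; // exposed so the laziness can be observed

        PagedIterator(PageSource source) {
            this.source = source;
        }

        @Override
        public boolean hasNext() {
            // Fetch another page only when the current one is used up.
            while (!current.hasNext() && !exhausted) {
                List<String> page = source.fetchPage(nextPage++);
                pagesFetched++;
                if (page.isEmpty()) {
                    exhausted = true;
                } else {
                    current = page.iterator();
                }
            }
            return current.hasNext();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    /** Simulated bucket holding {@code total} keys, served in pages of {@code pageSize}. */
    static PageSource simulatedBucket(int total, int pageSize) {
        return pageIndex -> {
            List<String> page = new ArrayList<>();
            long start = (long) pageIndex * pageSize;
            long end = Math.min(start + pageSize, (long) total);
            for (long i = start; i < end; i++) {
                page.add("key-" + i);
            }
            return page;
        };
    }

    public static void main(String[] args) {
        PagedIterator it = new PagedIterator(simulatedBucket(3_000_000, 1000));
        String first = it.next(); // triggers exactly one page fetch
        System.out.println(first + " after " + it.pagesFetched + " page fetch(es)");
        long count = 1;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println(count + " entries listed");
    }
}
```

A treewalk that collects every entry before returning would instead pay the 
full listing cost up front, which is the blocking behaviour described above.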

> S3Guard: Improve listLocatedStatus
> ----------------------------------
>
>                 Key: HADOOP-13926
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13926
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HADOOP-13926.wip.proto.branch-13345.1.patch
>
>
> Need to check if {{listLocatedStatus}} can make use of metastore's 
> listChildren feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

