[ 
https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955650#comment-16955650
 ] 

Rajesh Balamohan commented on HDDS-2328:
----------------------------------------


Here is a small snippet of the code that was used for the large listing (the 
directory used had millions of entries, populated earlier).

ozone src details: https://github.com/apache/hadoop-ozone (commit 
b4a1afd60e3a3c7319a1ffa97d5ace3a95ed26f6).

{noformat}
    // Get path details
    ...
    ...
    long sTime = System.currentTimeMillis();
    RemoteIterator<LocatedFileStatus> rit = fs.listLocatedStatus(path);
    long count = 0;
    while (rit.hasNext()) {
      rit.next();
      count++;
    }
    long eTime = System.currentTimeMillis();
    ...
    ...
{noformat}
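The batched approach the issue asks for can be sketched as follows: fetch entries in fixed-size pages (S3A reportedly uses ~5K per fetch) and advance a continuation point, instead of materializing the whole directory at once. This is a minimal, self-contained illustration of the pagination pattern only; the backing list, {{listPage}}, and its offset parameter are hypothetical stand-ins, not Ozone or Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedListingSketch {
  // Page size per fetch; S3A reportedly uses ~5K.
  static final int BATCH_SIZE = 5000;

  // Plain list standing in for the key table of a large directory.
  static final List<String> store = new ArrayList<>();

  // Return up to BATCH_SIZE entries starting at the given continuation offset.
  static List<String> listPage(int startOffset) {
    int end = Math.min(startOffset + BATCH_SIZE, store.size());
    return store.subList(startOffset, end);
  }

  // Populate a fake directory and count its entries one page at a time,
  // so memory use is bounded by BATCH_SIZE rather than the directory size.
  static long countAll(int totalEntries) {
    store.clear();
    for (int i = 0; i < totalEntries; i++) {
      store.add("key-" + i);
    }
    long count = 0;
    int offset = 0;
    while (true) {
      List<String> page = listPage(offset);
      if (page.isEmpty()) {
        break;
      }
      count += page.size();   // caller processes one page at a time
      offset += page.size();  // advance the continuation point
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(countAll(12_000)); // prints 12000 (fetched in 3 pages)
  }
}
```

An actual fix inside Ozone Manager would carry the last returned key as the continuation token between client calls, rather than an integer offset.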

> Support large-scale listing 
> ----------------------------
>
>                 Key: HDDS-2328
>                 URL: https://issues.apache.org/jira/browse/HDDS-2328
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Rajesh Balamohan
>            Assignee: Hanisha Koneru
>            Priority: Major
>              Labels: performance
>
> Large-scale listing of directory contents takes much longer and also has the 
> potential to run into OOM. I have > 1 million entries at the same level, and 
> listing them with {{RemoteIterator}} took much longer (it didn't complete, as 
> it was stuck in RDB::seek).
> S3A batches listings at 5K entries per fetch, IIRC. It would be good to have 
> this feature in Ozone as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
