[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955650#comment-16955650 ]
Rajesh Balamohan commented on HDDS-2328:
----------------------------------------

Here is a small snippet of the code used for the large listing (the directory I used had millions of entries, populated earlier).

ozone src details: https://github.com/apache/hadoop-ozone (commit b4a1afd60e3a3c7319a1ffa97d5ace3a95ed26f6).

{noformat}
// Get path details
...
...
long sTime = System.currentTimeMillis();
RemoteIterator<LocatedFileStatus> rit = fs.listLocatedStatus(path);
long count = 0;
while (rit.hasNext()) {
  rit.next();
  count++;
}
long eTime = System.currentTimeMillis();
...
...
{noformat}

> Support large-scale listing
> ----------------------------
>
>                 Key: HDDS-2328
>                 URL: https://issues.apache.org/jira/browse/HDDS-2328
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Rajesh Balamohan
>            Assignee: Hanisha Koneru
>            Priority: Major
>              Labels: performance
>
> Large-scale listing of directory contents takes much longer and also has the
> potential to run into OOM. I have over 1 million entries at the same level,
> and listing took much longer with {{RemoteIterator}} (it didn't complete, as
> it was stuck in RDB::seek).
> S3A batches listings at 5K entries per fetch, IIRC. It would be good to have
> this feature in Ozone as well.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
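The batched listing the description suggests (S3A-style, a fixed number of entries per fetch) can be sketched as a paging iterator. The following is a minimal, self-contained Java sketch with no Ozone/Hadoop dependencies; the names `BatchedLister`, `BATCH_SIZE`, and `fetchPage` are illustrative, and `fetchPage` stands in for one request to the Ozone Manager that resumes from a start key. The point is that memory use is bounded by the batch size rather than the total entry count:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class BatchedLister {
    static final int BATCH_SIZE = 5000; // S3A-style page size

    // Simulated backing store with totalEntries keys "0".."N-1".
    // A real implementation would issue one OM/RDB request per page,
    // resuming from the key after startAfter.
    static List<String> fetchPage(int totalEntries, String startAfter) {
        int start = (startAfter == null) ? 0 : Integer.parseInt(startAfter) + 1;
        List<String> page = new ArrayList<>();
        for (int i = start; i < Math.min(start + BATCH_SIZE, totalEntries); i++) {
            page.add(Integer.toString(i));
        }
        return page;
    }

    // Iterator that pulls one page at a time instead of materializing
    // the full listing in memory.
    static Iterator<String> listIterator(final int totalEntries) {
        return new Iterator<String>() {
            private List<String> page = fetchPage(totalEntries, null);
            private int pos = 0;

            @Override
            public boolean hasNext() {
                if (pos < page.size()) {
                    return true;
                }
                if (page.isEmpty()) {
                    return false;
                }
                // Current page exhausted: fetch the next one, keyed off
                // the last entry we returned.
                page = fetchPage(totalEntries, page.get(page.size() - 1));
                pos = 0;
                return !page.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(pos++);
            }
        };
    }

    public static void main(String[] args) {
        // Same counting loop shape as the benchmark snippet above.
        long count = 0;
        Iterator<String> it = listIterator(12_345);
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println(count);
    }
}
```

With 12,345 simulated entries this walks three pages (5000 + 5000 + 2345) while holding at most one page in memory, which is the property that avoids the OOM risk the issue describes.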