[jira] [Commented] (HDDS-2328) Support large-scale listing
[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956216#comment-16956216 ]

Anu Engineer commented on HDDS-2328:

Agree. We should probably do what S3AFileSystem has done.

> Support large-scale listing
>
> Key: HDDS-2328
> URL: https://issues.apache.org/jira/browse/HDDS-2328
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Rajesh Balamohan
> Assignee: Hanisha Koneru
> Priority: Major
> Labels: performance
>
> Large-scale listing of directory contents takes a lot longer time and also
> has the potential to run into OOM. I have > 1 million entries in the same
> level and it took lot longer time with {{RemoteIterator}} (didn't complete as
> it was stuck in RDB::seek).
> S3A batches it with 5K listing per fetch IIRC. It would be good to have this
> feature in ozone as well.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2328) Support large-scale listing
[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955756#comment-16955756 ]

Lokesh Jain commented on HDDS-2328:
---

Currently we do not implement FileSystem#listLocatedStatus api in Ozone. Therefore it ends up calling listStatus for the entire directory at once which can lead to OOM. I think we just need to have an implementation for listLocatedStatus and other such related apis in BasicOzoneFileSystem.
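The batching idea raised in this thread (fetch a bounded page of entries per call instead of the whole directory at once) can be illustrated with a minimal, dependency-free sketch. Everything named here is an assumption made for illustration: `PageFetcher`, `PagedIterator`, and `inMemoryFetcher` are hypothetical and are not Ozone or Hadoop APIs, and plain `java.util.Iterator` stands in for Hadoop's `RemoteIterator` so that the example runs without Hadoop on the classpath. It is not the implementation proposed for BasicOzoneFileSystem.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PagedListingSketch {

  /** Hypothetical stand-in for one server-side listing call; not a real Ozone/Hadoop API. */
  interface PageFetcher {
    /** Returns up to batchSize keys strictly after startKey (null = start of listing). */
    List<String> fetch(String startKey, int batchSize);
  }

  /** Iterator that keeps at most one page of entries in memory at a time. */
  static final class PagedIterator implements Iterator<String> {
    private final PageFetcher fetcher;
    private final int batchSize;
    private Iterator<String> page = Collections.emptyIterator();
    private String lastKey = null;
    private boolean exhausted = false;

    PagedIterator(PageFetcher fetcher, int batchSize) {
      if (batchSize <= 0) {
        throw new IllegalArgumentException("batchSize must be positive");
      }
      this.fetcher = fetcher;
      this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
      // Fetch the next page lazily, only when the current one is used up.
      while (!page.hasNext() && !exhausted) {
        List<String> next = fetcher.fetch(lastKey, batchSize);
        if (next.isEmpty()) {
          exhausted = true; // an empty page marks the end of the listing
        } else {
          lastKey = next.get(next.size() - 1); // resume after this key next time
          page = next.iterator();
        }
      }
      return page.hasNext();
    }

    @Override
    public String next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return page.next();
    }
  }

  /** Simulates the server side with a sorted in-memory key list (illustration only). */
  static PageFetcher inMemoryFetcher(List<String> sortedKeys) {
    return (startKey, batchSize) -> {
      int from = 0;
      if (startKey != null) {
        int idx = Collections.binarySearch(sortedKeys, startKey);
        // Found: resume just after it. Not found: resume at the insertion point.
        from = idx >= 0 ? idx + 1 : -idx - 1;
      }
      int to = Math.min(sortedKeys.size(), from + batchSize);
      return new ArrayList<>(sortedKeys.subList(from, to));
    };
  }

  /** Counts entries by iterating page by page, mirroring the timing loop style used in this thread. */
  public static long countEntries(List<String> sortedKeys, int batchSize) {
    Iterator<String> it = new PagedIterator(inMemoryFetcher(sortedKeys), batchSize);
    long count = 0;
    while (it.hasNext()) {
      it.next();
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<>();
    for (int i = 0; i < 12; i++) {
      keys.add(String.format("key-%04d", i)); // zero-padded so lexical order matches numeric order
    }
    System.out.println(countEntries(keys, 5));
  }
}
```

With this shape, the client's memory use is bounded by the batch size rather than by the directory size, and each call resumes from the last key returned. Whether a real implementation should resume by key, and what batch size is appropriate, are open design questions that this sketch does not settle.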
[jira] [Commented] (HDDS-2328) Support large-scale listing
[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955650#comment-16955650 ]

Rajesh Balamohan commented on HDDS-2328:

Here is the small code snippet that was used for the large listing (the directory I used had millions of entries and was populated earlier).

Ozone source details: https://github.com/apache/hadoop-ozone (commit b4a1afd60e3a3c7319a1ffa97d5ace3a95ed26f6).

{noformat}
// Get path details
...
...
// Time a full iteration over the directory listing
long sTime = System.currentTimeMillis();
RemoteIterator<LocatedFileStatus> rit = fs.listLocatedStatus(path);
long count = 0;
while (rit.hasNext()) {
  rit.next();
  count++;
}
long eTime = System.currentTimeMillis();
...
...
{noformat}
[jira] [Commented] (HDDS-2328) Support large-scale listing
[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954802#comment-16954802 ]

Anu Engineer commented on HDDS-2328:

The Listing API interface already does that. I will take a look at why we are not paging ... Can you please provide me with repro steps and which version of branch you tried with this?