[ https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173430#comment-16173430 ]
Nandakumar edited comment on HDFS-12506 at 9/20/17 4:19 PM:
------------------------------------------------------------

+1 for [~xyao]'s idea; I was thinking along the same lines. One small change, though:

For Volume: /#v1
For Bucket: /v1/#b1
Keys can be stored as they are stored now.

With this we can iterate and get the list of volumes without iterating over buckets, and get the list of buckets without iterating over keys.

Something like
{code}
/#v1
/#v2
/#v3
/v1/#b1
/v1/#b2
/v2/#b1
/v3/#b1
/v1/b1/k1
/v2/b2/k2
{code}

was (Author: nandakumar131):
+1 for [~xyao]'s idea, I was also thinking of the same. One small change though

For Volume /#v1
For Bucket /v1/#b1
Keys can be stored as they are stored now

With this we can iterate and get list of volumes without iterating over buckets, and get list of buckets without iterating over keys.

Something lime
{code}
/#v1
/#v2
/#v3
/v1/#b1
/v1/#b2
/v2/#b1
/v3/#b1
/v1/b1/k1
/v2/b2/k2
{code}

> Ozone: ListBucket is too slow
> -----------------------------
>
>          Key: HDFS-12506
>          URL: https://issues.apache.org/jira/browse/HDFS-12506
>      Project: Hadoop HDFS
>   Issue Type: Sub-task
>   Components: ozone
>     Reporter: Weiwei Yang
>     Priority: Blocker
>       Labels: ozoneMerge
>
> Generated 3 million keys in ozone, and ran the {{listBucket}} command to get a
> list of buckets under a volume:
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> This call took over *15 seconds* to finish. The problem is caused by the
> inflexible structure of the KSM DB. Right now {{ksm.db}} stores keys as
> follows:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so when we list the buckets under a volume,
> e.g. /v1, we need to seek to the /v1 point and then iterate and filter keys,
> which ends up scanning all keys under volume /v1.
> The problem with this design is that we have no efficient way to locate all
> buckets without scanning the keys.
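
To illustrate the difference between the current layout and the one proposed in the comment, here is a minimal Python sketch. It is not Ozone code: it assumes the KSM DB behaves like a byte-ordered key-value store and models it with a sorted list. The helper names (prefix_scan, list_buckets_current, list_buckets_proposed, list_volumes_proposed) are hypothetical, and the sketch assumes '#' never starts a real volume or bucket name.

```python
import bisect


def prefix_scan(sorted_keys, prefix):
    """Seek to `prefix` and iterate until the prefix no longer matches.

    Returns (matching_keys, visited), where `visited` counts every entry
    the iterator touched, including the first non-matching one (if any).
    """
    i = bisect.bisect_left(sorted_keys, prefix)  # models a DB seek
    matched, visited = [], 0
    while i < len(sorted_keys):
        visited += 1
        if not sorted_keys[i].startswith(prefix):
            break
        matched.append(sorted_keys[i])
        i += 1
    return matched, visited


def list_buckets_current(db, volume):
    """Current layout: buckets and keys share the '/<volume>/' prefix,
    so every key under the volume is visited and then filtered."""
    entries, visited = prefix_scan(db, "/" + volume + "/")
    buckets = [k for k in entries if k.count("/") == 2]
    return buckets, visited


def list_buckets_proposed(db, volume):
    """Proposed layout: bucket entries are stored as '/<volume>/#<bucket>'.
    '#' sorts before letters and digits, so they form one contiguous run."""
    return prefix_scan(db, "/" + volume + "/#")


def list_volumes_proposed(db):
    """Proposed layout: volume entries are stored as '/#<volume>'."""
    return prefix_scan(db, "/#")


# Data from the issue description, in the current layout.
CURRENT_LAYOUT = sorted([
    "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
    "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
    "/v1/b3", "/v1/b4",
])

# The same volume, buckets and keys, rewritten in the proposed layout.
PROPOSED_LAYOUT = sorted([
    "/#v1",
    "/v1/#b1", "/v1/#b2", "/v1/#b3", "/v1/#b4",
    "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
    "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
])
```

On this toy data, list_buckets_current(CURRENT_LAYOUT, "v1") visits all 10 entries, while list_buckets_proposed(PROPOSED_LAYOUT, "v1") visits 5 (the 4 bucket entries plus the first non-matching key). The gap grows with the number of keys per bucket, which is the point of the proposal.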