[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2018-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455246#comment-16455246
 ] 

Hudson commented on HDFS-12506:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14070 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14070/])
HDFS-12506. Ozone: ListBucket is too slow. Contributed by Weiwei Yang. 
(omalley: rev fd1564b87ec557638925730a003ba0c4e8926cf8)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManagerImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/TestMetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/scm/cli/SQLCLI.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestVolume.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestBuckets.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/RocksDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/MetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/LevelDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/ksm/TestBucketManagerImpl.java


> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Fix For: HDFS-7240
>
> Attachments: HDFS-12506-HDFS-7240.001.patch, 
> HDFS-12506-HDFS-7240.002.patch, HDFS-12506-HDFS-7240.003.patch, 
> HDFS-12506-HDFS-7240.004.patch, HDFS-12506-HDFS-7240.005.patch, 
> HDFS-12506-HDFS-7240.006.patch, HDFS-12506-HDFS-7240.007.patch
>
>
> Generated 3 million keys in Ozone and ran the {{listBucket}} command to get a 
> list of buckets under a volume:
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> This call took over *15 seconds* to finish. The problem is caused by the 
> inflexible structure of the KSM DB. Right now {{ksm.db}} stores keys like the 
> following:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so to list the buckets under a volume, e.g. 
> /v1, we need to seek to the /v1 point, then iterate and filter keys. This 
> ends up scanning all keys under volume /v1. The problem with this design is 
> that we have no efficient way to locate all buckets without scanning the 
> keys.
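
The scan cost described above can be reproduced with a small sketch. This is only an illustration under stated assumptions: a java.util.TreeMap stands in for the sorted ksm.db key space, and the class and method names (FlatKeyLayoutSketch, buildStore, listBuckets) are hypothetical, not taken from the Ozone code. The key counts are made up for the example and are not the 3 million keys from the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a TreeMap stands in for the sorted ksm.db key space.
final class FlatKeyLayoutSketch {

  /** Number of entries visited by the most recent listBuckets call. */
  static int visited;

  /** Builds one volume (/v1) with the given number of buckets and keys per bucket. */
  static TreeMap<String, String> buildStore(int buckets, int keysPerBucket) {
    TreeMap<String, String> db = new TreeMap<>();
    for (int b = 1; b <= buckets; b++) {
      String bucket = "/v1/b" + b;
      db.put(bucket, "bucket-info");
      for (int k = 1; k <= keysPerBucket; k++) {
        db.put(bucket + "/k" + k, "key-info");
      }
    }
    return db;
  }

  /** Lists buckets by seeking to the volume prefix, then iterating and filtering. */
  static List<String> listBuckets(TreeMap<String, String> db, String volume) {
    String prefix = volume + "/";
    List<String> result = new ArrayList<>();
    visited = 0;
    for (Map.Entry<String, String> e : db.tailMap(prefix, true).entrySet()) {
      String key = e.getKey();
      if (!key.startsWith(prefix)) {
        break; // left the key range of this volume
      }
      visited++;
      // A bucket entry has no further '/' after the volume prefix.
      if (key.indexOf('/', prefix.length()) < 0) {
        result.add(key);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    TreeMap<String, String> db = buildStore(4, 1000);
    List<String> buckets = listBuckets(db, "/v1");
    System.out.println(buckets.size() + " buckets, " + visited + " entries visited");
  }
}
```

Because bucket entries are interleaved with the keys of earlier buckets, the loop cannot stop early: every entry under the volume is visited even though only a handful are buckets.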



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2018-04-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451011#comment-16451011
 ] 

Hudson commented on HDFS-12506:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
HDFS-12506. Ozone: ListBucket is too slow. Contributed by Weiwei Yang. (wwei: 
rev e01245495f71a20a5478c29c32d849d4b2720c57)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManagerImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/TestMetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/MetadataStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/RocksDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestBuckets.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/scm/cli/SQLCLI.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/ksm/TestBucketManagerImpl.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/utils/LevelDBStore.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/ozone/web/client/TestVolume.java





[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-25 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178744#comment-16178744
 ] 

Weiwei Yang commented on HDFS-12506:


Thanks for the quick response [~linyiqun], I am going to commit this patch 
shortly.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-25 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178741#comment-16178741
 ] 

Yiqun Lin commented on HDFS-12506:
--

[~cheersyang], feel free to commit. The javadoc @link looks right and resolves 
to the method {{getRangeKVs}} in my local environment.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-25 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178712#comment-16178712
 ] 

Weiwei Yang commented on HDFS-12506:


The javadoc warnings seem to be a false alarm; the tool doesn't understand the 
doc reference

{code}
/** 
 * {@link #getRangeKVs(byte[], int, MetadataKeyFilter...)}
 **/
{code}

which works perfectly in my IntelliJ IDE. I manually ran 
{{mvn javadoc:javadoc}} and got the following warnings:

{noformat}
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java:45:
 warning - Tag @link: reference not found: 
org.apache.ratis.statemachine.StateMachine
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/AbstractFuture.java:1274:
 warning - Tag @link: can't find newDirectExecutorService() in 
org.apache.hadoop.hdfs.server.datanode.checker.AbstractFuture
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/AbstractFuture.java:1274:
 warning - Tag @link: reference not found: CallerRunsPolicy
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/XceiverServerHandler.java:75:
 warning - Tag @link: reference not found: 
ChannelHandlerContext#fireExceptionCaught(Throwable)
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/protocolPB/KeySpaceManagerProtocolServerSideTranslatorPB.java:95:
 warning - Tag @link: reference not found: 
org.apache.hadoop.ozone.ksm.protocolPB.KeySpaceManagerProtocolPB
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/BucketManager.java:76:
 warning - Tag @link: reference not found: KsmBucketInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/BucketManager.java:76:
 warning - Tag @link: reference not found: KsmBucketInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManager.java:164:
 warning - Tag @link: reference not found: KsmBucketInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManager.java:164:
 warning - Tag @link: reference not found: KsmBucketInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KeyManager.java:101:
 warning - Tag @link: reference not found: KsmKeyInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/ksm/KSMMetadataManager.java:188:
 warning - Tag @link: reference not found: KsmKeyInfo
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/web/netty/RequestContentObjectStoreChannelHandler.java:42:
 warning - Tag @link: reference not found: HttpContent
[WARNING] 
/Users/yangwwei/IdeaProjects/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/ozone/web/netty/RequestDispatchObjectStoreChannelHandler.java:44:
 warning - Tag @link: reference not found: HttpRequest
{noformat}

It looks like the javadoc plugin doesn't recognize these links. I think we can 
commit this; these 2 links are useful to help people understand the difference 
between the two methods.
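
For reference, here is a minimal sketch of the {@link #method(Type...)} form discussed above, where two overloads point at each other in their Javadoc. Everything in it is hypothetical (KeyFilterSketch, JavadocLinkSketch, select); it only shows the link syntax, including how a varargs parameter is written in the reference, and does not attempt to diagnose the warnings listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter type used for illustration; not the Hadoop class. */
interface KeyFilterSketch {
  boolean accept(byte[] key);
}

/** Hypothetical holder class that only demonstrates Javadoc member links. */
final class JavadocLinkSketch {

  /**
   * Returns at most {@code count} of the given keys that pass every filter.
   * For the unfiltered overload see {@link #select(List, int)}.
   */
  static List<byte[]> select(List<byte[]> keys, int count,
      KeyFilterSketch... filters) {
    List<byte[]> out = new ArrayList<>();
    for (byte[] key : keys) {
      if (out.size() >= count) {
        break;
      }
      boolean accepted = true;
      for (KeyFilterSketch filter : filters) {
        if (!filter.accept(key)) {
          accepted = false;
          break;
        }
      }
      if (accepted) {
        out.add(key);
      }
    }
    return out;
  }

  /**
   * Returns at most {@code count} keys without filtering. Delegates to
   * {@link #select(List, int, KeyFilterSketch...)}; the varargs parameter is
   * written as {@code KeyFilterSketch...} in the link, as in the declaration.
   */
  static List<byte[]> select(List<byte[]> keys, int count) {
    return select(keys, count, new KeyFilterSketch[0]);
  }
}
```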


[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178569#comment-16178569
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 40s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 30s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 41s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 42s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 45s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 51s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 2 new + 10 unchanged - 0 fixed = 12 total (was 10) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 32s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 50s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888770/HDFS-12506-HDFS-7240.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 7e1a77b55c73 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 97ff55e |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/21332/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/21332/artifact/patchprocess/diff-javadoc-javadoc-

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-24 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178512#comment-16178512
 ] 

Anu Engineer commented on HDFS-12506:
-

bq. JIRA HDFS-12539 to get these stuff fixed. Does that sound good to you?
Perfect, +1 on this change.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-24 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178500#comment-16178500
 ] 

Yiqun Lin commented on HDFS-12506:
--

I'm okay with your comment. +1, pending Jenkins.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-24 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178495#comment-16178495
 ] 

Weiwei Yang commented on HDFS-12506:


Hi [~linyiqun]

I just uploaded the v7 patch, which hopefully fixes the javadoc warnings. 
Regarding your comment

bq. getSequentialRangeKVs can also make sense in listKeys

Actually, there are more places that should be switched to 
{{getSequentialRangeKVs}}. I did not include them in this patch because I 
haven't tested them all. I will open another JIRA to track this and make sure 
they get fixed with sufficient testing. Let's keep this JIRA focused on fixing 
the {{listBucket}} issue. Does that sound good to you?

[~anu], thanks for reviewing this patch. Since your comments are not about 
changes introduced by this patch, I have opened a separate, lower-priority 
cleanup JIRA, HDFS-12539, to get this stuff fixed. Does that sound good to you?

Thanks




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-24 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178356#comment-16178356
 ] 

Anu Engineer commented on HDFS-12506:
-

[~cheersyang], thanks for finding and fixing this issue. I am +1 overall on 
this change; I have some minor comments.

1. nit: Can you please file a follow-up cleanup JIRA to rename these functions? 
I know these are not related to your patch, but I noticed them while reading 
the code.
* getBucketKeyPrefix --> getBucketWithDBPrefix
* getKeyKeyPrefix --> getKeyWithDBPrefix
* getDBKeyForKey --> getDBKeyBytes; it is also possible to rewrite getDBKeyForKey as {{return(DFSUtil.string2Bytes(getKeyWithDBPrefix()))}}

2. nit: Again not related to your change: instead of repeating this in many 
places in the code 
{{OzoneConsts.KSM_KEY_PREFIX + volume
+ OzoneConsts.KSM_KEY_PREFIX + bucket
+ OzoneConsts.KSM_KEY_PREFIX;}}
it might be a good idea to have 3 functions:
* getBucketWithDBPrefix
* getKeyWithDBPrefix
* getVolumeWithDBPrefix 
and just reuse them everywhere. That avoids mistakes when we edit the code 
later, since otherwise every one of these places has to be touched for any 
future change.
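
A rough sketch of what such helpers might look like follows. The method names come from the suggestion above, but the parameter lists, the PREFIX value, and the enclosing class DbKeyNamesSketch are assumptions made for illustration; the thread does not show the real signatures or the value of OzoneConsts.KSM_KEY_PREFIX. The sketch also uses String.getBytes with UTF-8 in place of the DFSUtil call mentioned above, to stay self-contained.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; signatures are assumed, not taken from the patch.
final class DbKeyNamesSketch {

  // Assumed separator standing in for OzoneConsts.KSM_KEY_PREFIX.
  static final String PREFIX = "/";

  private DbKeyNamesSketch() {
  }

  /** Builds the DB name of a volume, e.g. "/v1". */
  static String getVolumeWithDBPrefix(String volume) {
    return PREFIX + volume;
  }

  /** Builds the DB name of a bucket, e.g. "/v1/b1". */
  static String getBucketWithDBPrefix(String volume, String bucket) {
    return getVolumeWithDBPrefix(volume) + PREFIX + bucket;
  }

  /** Builds the DB name of a key, e.g. "/v1/b1/k1". */
  static String getKeyWithDBPrefix(String volume, String bucket, String key) {
    return getBucketWithDBPrefix(volume, bucket) + PREFIX + key;
  }

  /** Returns the DB key as bytes, built on top of getKeyWithDBPrefix. */
  static byte[] getDBKeyBytes(String volume, String bucket, String key) {
    return getKeyWithDBPrefix(volume, bucket, key)
        .getBytes(StandardCharsets.UTF_8);
  }
}
```

With this shape, the repeated concatenation quoted above corresponds to getKeyWithDBPrefix(volume, bucket, ""), and a change to the separator or layout is made in one place only.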




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178060#comment-16178060
 ] 

Yiqun Lin commented on HDFS-12506:
--

Thanks for the update and for sharing the data, [~cheersyang]! The tested 
improvement looks very nice. Two comments:

* Please fix the javadoc warnings.
* {{getSequentialRangeKVs}} can also make sense in {{listKeys}}. Can you also 
use it in {{KSMMetadataManagerImpl#listKeys}}? We can stop looking up 
once we have listed all the keys under the given bucket.
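
The early-stop idea in the second comment can be sketched as follows. This is not the MetadataStore implementation: a TreeMap stands in for the sorted store, and rangeScan and sequentialScan are hypothetical names for a scan that skips rejected keys versus one that stops at the first rejected key. The assumption that the real {{getSequentialRangeKVs}} behaves like the second variant is inferred from the comment above, not confirmed by the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Predicate;

// Illustrative only: contrasts a filtering scan with an early-stopping scan.
final class RangeScanSketch {

  /** Number of entries visited by the most recent scan. */
  static int visited;

  /** Sample data: 3 keys under /v1/b1 followed by 100 keys under /v1/b2. */
  static TreeMap<String, String> sample() {
    TreeMap<String, String> db = new TreeMap<>();
    for (int k = 1; k <= 3; k++) {
      db.put("/v1/b1/k" + k, "key-info");
    }
    for (int k = 0; k < 100; k++) {
      db.put("/v1/b2/k" + k, "key-info");
    }
    return db;
  }

  /** Collects up to count matches, skipping rejected keys and continuing. */
  static List<String> rangeScan(TreeMap<String, String> db, String startKey,
      int count, Predicate<String> filter) {
    List<String> out = new ArrayList<>();
    visited = 0;
    for (String key : db.tailMap(startKey, true).keySet()) {
      if (out.size() >= count) {
        break;
      }
      visited++;
      if (filter.test(key)) {
        out.add(key);
      }
    }
    return out;
  }

  /** Collects up to count matches, stopping at the first rejected key. */
  static List<String> sequentialScan(TreeMap<String, String> db,
      String startKey, int count, Predicate<String> filter) {
    List<String> out = new ArrayList<>();
    visited = 0;
    for (String key : db.tailMap(startKey, true).keySet()) {
      if (out.size() >= count) {
        break;
      }
      visited++;
      if (!filter.test(key)) {
        break; // keys are sorted, so no later key can match a prefix filter
      }
      out.add(key);
    }
    return out;
  }
}
```

The early stop is only safe when the matching keys form one contiguous run in sort order, as they do for a prefix filter that starts at the prefix itself. For filters that can match again after a gap, the skipping variant is still needed.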




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178045#comment-16178045
 ] 

Weiwei Yang commented on HDFS-12506:


Hi [~linyiqun]

Could you please help review the v6 patch? Thanks.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177936#comment-16177936
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 24s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 12s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 57s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 25s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 14s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 13s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 50s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 3 new + 10 unchanged - 0 fixed = 13 total (was 10) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 29s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 58s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m  5s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888718/HDFS-12506-HDFS-7240.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 360facf34f75 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / bf08dc3 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/21326/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/21326/artifact/patchprocess/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.t

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177883#comment-16177883
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 57s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  4s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 51s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  2s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 11s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 47s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 30s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m 16s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 18s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888712/HDFS-12506-HDFS-7240.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux e750081ed5ce 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 16dd69a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/21323/artifact/patchprocess

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177866#comment-16177866
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 58s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 38s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 40s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 49s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 44s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 33s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 35s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m  0s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.TestAclsEndToEnd |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888711/HDFS-12506-HDFS-7240.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 8864d8f28db5 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 16dd69a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HDFS-Build/21322/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21322/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177865#comment-16177865
 ] 

Weiwei Yang commented on HDFS-12506:


I found one more issue: this patch was not enough to fix the problem. I 
re-populated 3 million keys with the patch applied and listBucket was still 
slow. The reason is that {{MetadataStore#getRangeKVs}} does not stop looking 
for buckets even after all keys with the bucket prefix have been iterated. I 
just uploaded the v6 patch to fix this.

The v6 patch adds another API to {{MetadataStore}}; see the Javadoc:

{code}
  /**
   * This method is very similar to
   * {@link #getRangeKVs(byte[], int, MetadataKeyFilter...)}; the only
   * difference is that this method is supposed to return a sequential range
   * of elements based on the filters. While iterating the elements,
   * if it meets any entry that cannot pass the filter, the iterator stops
   * at that point without looking for the next match. If no filter is given,
   * this method behaves just like
   * {@link #getRangeKVs(byte[], int, MetadataKeyFilter...)}.
   *
   * @param startKey
   * @param count
   * @param filters
   * @return
   * @throws IOException
   * @throws IllegalArgumentException
   */
  List<Map.Entry<byte[], byte[]>> getSequentialRangeKVs(byte[] startKey,
      int count, MetadataKeyFilter... filters)
      throws IOException, IllegalArgumentException;
{code}

Since buckets are sorted, they should be retrieved with 
{{getSequentialRangeKVs}} to avoid unnecessary lookups and improve 
performance. Tested on my cluster with the latest v6 patch, the listBucket 
call takes around *950ms*.
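
The difference between the two range scans can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the {{MetadataStore}} implementation: a {{TreeMap}} of strings stands in for the store, a {{Predicate}} stands in for {{MetadataKeyFilter}}, the {{sequential}} flag selects between the two behaviours, and the key layout with "#" markers ("/#v1/#b1") is assumed purely so that bucket entries sort together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Predicate;

class RangeScanSketch {
  /** Number of entries touched by the most recent rangeKeys call. */
  static int visited;

  /** Assumed layout: volume and bucket entries carry a "#" marker. */
  static TreeMap<String, String> sampleDb() {
    TreeMap<String, String> db = new TreeMap<>();
    String[] keys = {
        "/#v1", "/#v1/#b1", "/#v1/#b2", "/#v1/#b3",
        "/v1/b1/k1", "/v1/b1/k2", "/v1/b2/k1"};
    for (String key : keys) {
      db.put(key, "");
    }
    return db;
  }

  /**
   * Returns up to count keys at or after startKey that pass the filter.
   * With sequential == true the scan stops at the first key that fails
   * the filter; otherwise it keeps looking for later matches.
   */
  static List<String> rangeKeys(TreeMap<String, String> db, String startKey,
      int count, Predicate<String> filter, boolean sequential) {
    List<String> result = new ArrayList<>();
    visited = 0;
    for (String key : db.tailMap(startKey, true).keySet()) {
      if (result.size() >= count) {
        break;
      }
      visited++;
      if (filter.test(key)) {
        result.add(key);
      } else if (sequential) {
        break; // sorted input: no later key can match a prefix filter
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Predicate<String> bucketsOfV1 = key -> key.startsWith("/#v1/#");
    TreeMap<String, String> db = sampleDb();
    System.out.println(rangeKeys(db, "/#v1/#", 100, bucketsOfV1, false)
        + " visited=" + visited);
    System.out.println(rangeKeys(db, "/#v1/#", 100, bucketsOfV1, true)
        + " visited=" + visited);
  }
}
```

Both calls return the same three bucket keys. The non-sequential scan touches all six entries at or after the start key, while the sequential scan stops after the first non-matching entry, having touched four.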

Please help to review. Thanks!




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-23 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177811#comment-16177811
 ] 

Weiwei Yang commented on HDFS-12506:


Thanks [~linyiqun] for the review. I have uploaded the v5 patch, which adds a 
test case for creating a volume/bucket with the prefix "#". Regarding your 
first comment about {{KeyType.UNKNOWN}}, I don't think that is necessary. We 
have a desired format for KSM DB keys; if an entry has some other format and 
is still treated as a valid key and handed to

{code}
BucketInfo bucketInfo = BucketInfo.parseFrom(value);
{code}

protobuf will throw an exception saying that parsing failed because the format 
does not match. Even if we added logic to return an unknown type, it would 
still fail here, just with a different error message, "Unknown key from 
ksm.db". That's why I think we don't need to count the number of "/"s. A 
simple protobuf message is informative enough. Let me know if you disagree. 
Thanks.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-22 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176288#comment-16176288
 ] 

Yiqun Lin commented on HDFS-12506:
--

Thanks for the update, [~cheersyang]! The latest patch looks good now. I just 
found one nit:
{noformat}
-} else {
-  int count = key.length() - key.replace(KSM_VOLUME_PREFIX, "").length();
-  // NOTE : when delimiter gets changed, will need to change this part
-  if (count == 1) {
-return KeyType.VOLUME;
-  } else if (count == 2) {
-return KeyType.BUCKET;
-  } else if (count >= 3) {
-return KeyType.KEY;
-  } else {
-return KeyType.UNKNOWN;
-  }
+} else if (key.startsWith(KSM_VOLUME_PREFIX)) {
+  return key.replaceFirst(KSM_VOLUME_PREFIX, "")
+  .contains(KSM_BUCKET_PREFIX) ? KeyType.BUCKET : KeyType.VOLUME;
+}else {
+  return KeyType.KEY;
 }
{noformat}
Here the check for type {{KeyType.UNKNOWN}} is missing. One other 
suggestion: can you add a UT that creates a volume/bucket with the prefix char 
"#"? I think the method {{OzoneUtils#verifyResourceName}} already does this 
check, so just add a test to confirm it.
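
For readers following the diff, the prefix-based classification on the "+" side can be sketched as a standalone class. This is a rough illustration only: the constant values ("$", "/#") are assumptions and are not shown in this thread, the user-prefix branch is a guess at what precedes the quoted {{else}}, and the class and enum names are hypothetical. It uses {{substring}} instead of {{replaceFirst}} to strip the leading prefix literally.

```java
class KeyTypeSketch {
  // Assumed values; the real constants live in OzoneConsts and may differ.
  static final String USER_PREFIX = "$";
  static final String VOLUME_PREFIX = "/#";
  static final String BUCKET_PREFIX = "/#";

  enum KeyType { USER, VOLUME, BUCKET, KEY }

  /** Classifies a DB key by its prefix markers, mirroring the quoted diff. */
  static KeyType getKeyType(String key) {
    if (key.startsWith(USER_PREFIX)) {
      // Hypothetical branch: the diff starts mid-method, so this is a guess.
      return KeyType.USER;
    } else if (key.startsWith(VOLUME_PREFIX)) {
      // Drop the leading volume marker, then look for a bucket marker.
      String rest = key.substring(VOLUME_PREFIX.length());
      return rest.contains(BUCKET_PREFIX) ? KeyType.BUCKET : KeyType.VOLUME;
    } else {
      // Anything else is treated as an object key; there is no UNKNOWN type.
      return KeyType.KEY;
    }
  }

  public static void main(String[] args) {
    for (String key : new String[] {"$user", "/#v1", "/#v1/#b1", "/v1/b1/k1"}) {
      System.out.println(key + " -> " + getKeyType(key));
    }
  }
}
```

Because the final branch is a catch-all, a malformed entry is classified as {{KEY}}, which is the behaviour the {{KeyType.UNKNOWN}} remark above is about.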




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176138#comment-16176138
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 30s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 39s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 45s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 38s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 47s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 25s{color} | {color:red} hadoop-hdfs-client in HDFS-7240 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 41s{color} | {color:red} hadoop-hdfs in HDFS-7240 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 19s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 39s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 27s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m  1s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 52s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.ozone.scm.node.TestQueryNode |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888444/HDFS-12506-HDFS-7240.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 19cb6aa3e972 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/per

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175956#comment-16175956
 ] 

Weiwei Yang commented on HDFS-12506:


The failed UTs are related; investigating...

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch, 
> HDFS-12506-HDFS-7240.002.patch
>
>
> Generated 3 million keys in ozone, and run {{listBucket}} command to get a 
> list of buckets under a volume,
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> this call spent over *15 seconds* to finish. The problem was caused by the 
> inflexible structure of KSM DB. Right now {{ksm.db}} stores keys like 
> following
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so to list the buckets under a volume, e.g. 
> /v1, we need to seek to the /v1 point and then iterate over and filter the 
> keys, which ends up scanning every key under volume /v1. The problem with this 
> design is that there is no efficient way to locate all buckets without 
> scanning the keys.
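
The cost described above can be illustrated with a small sketch. It is not Ozone code: a {{TreeMap}} stands in for the sorted {{ksm.db}} key space, and the prefixed layout (bucket records under {{/#volume/#}}, key records under {{/volume/bucket/}}) is an assumption based on the {{/#vol/#}} root key mentioned elsewhere in this thread. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustration only: a TreeMap stands in for the sorted ksm.db key space. */
class KsmLayoutSketch {

  /** Collects keys starting with prefix; visited[0] counts every matching key touched. */
  static List<String> scanPrefix(TreeMap<String, String> db, String prefix, int[] visited) {
    List<String> hits = new ArrayList<>();
    // tailMap(prefix, true) mimics seeking to the first key >= prefix.
    for (String key : db.tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break; // keys sharing a prefix are contiguous in sorted order
      }
      visited[0]++;
      hits.add(key);
    }
    return hits;
  }

  /** Flat layout from the description: buckets and keys share the /volume/ prefix. */
  static List<String> listBucketsFlat(TreeMap<String, String> db, String volume, int[] visited) {
    String prefix = "/" + volume + "/";
    List<String> buckets = new ArrayList<>();
    for (String key : scanPrefix(db, prefix, visited)) {
      String rest = key.substring(prefix.length());
      if (!rest.contains("/")) { // filter out key records such as b1/k1
        buckets.add(rest);
      }
    }
    return buckets;
  }

  /** Assumed prefixed layout: bucket records live under /#volume/#. */
  static List<String> listBucketsPrefixed(TreeMap<String, String> db, String volume, int[] visited) {
    String prefix = "/#" + volume + "/#";
    List<String> buckets = new ArrayList<>();
    for (String key : scanPrefix(db, prefix, visited)) {
      buckets.add(key.substring(prefix.length()));
    }
    return buckets;
  }

  /** Builds a sorted store containing the given keys with empty values. */
  static TreeMap<String, String> build(String... keys) {
    TreeMap<String, String> db = new TreeMap<>();
    for (String k : keys) {
      db.put(k, "");
    }
    return db;
  }

  public static void main(String[] args) {
    TreeMap<String, String> flat = build(
        "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
        "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3", "/v1/b3", "/v1/b4");
    TreeMap<String, String> prefixed = build(
        "/#v1", "/#v1/#b1", "/#v1/#b2", "/#v1/#b3", "/#v1/#b4",
        "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3");
    int[] flatVisited = {0};
    int[] prefixedVisited = {0};
    System.out.println(listBucketsFlat(flat, "v1", flatVisited) + " visited=" + flatVisited[0]);
    System.out.println(listBucketsPrefixed(prefixed, "v1", prefixedVisited)
        + " visited=" + prefixedVisited[0]);
  }
}
```

With the ten records from the description, the flat scan touches all ten, while the prefixed scan touches only the four bucket records, so the listing cost no longer grows with the number of keys in the volume.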



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175152#comment-16175152
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for branch |
| -1 | mvninstall | 18m 20s | root in HDFS-7240 failed. |
| +1 | compile | 1m 42s | HDFS-7240 passed |
| +1 | checkstyle | 0m 46s | HDFS-7240 passed |
| +1 | mvnsite | 1m 42s | HDFS-7240 passed |
| -1 | findbugs | 1m 46s | hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. |
| -1 | javadoc | 0m 22s | hadoop-hdfs-client in HDFS-7240 failed. |
| -1 | javadoc | 0m 40s | hadoop-hdfs in HDFS-7240 failed. |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 36s | the patch passed |
| +1 | compile | 1m 39s | the patch passed |
| +1 | javac | 1m 39s | the patch passed |
| +1 | checkstyle | 0m 43s | the patch passed |
| +1 | mvnsite | 1m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 57s | the patch passed |
| -1 | javadoc | 0m 20s | hadoop-hdfs-client in the patch failed. |
| -1 | javadoc | 0m 36s | hadoop-hdfs in the patch failed. |
|| || || || Other Tests ||
| +1 | unit | 1m 28s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 95m 5s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 136m 27s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.ksm.TestKeySpaceManager |
|   | hadoop.ozone.ozShell.TestOzoneShell |
|   | hadoop.ozone.ksm.TestBucketManagerImpl |
|   | hadoop.ozone.web.client.TestBuckets |
|   | hadoop.ozone.scm.TestAllocateContainer |
|   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
|   | hadoop.ozone.ksm.TestKSMSQLCli |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888318/HDFS-12506-HDFS-7240.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 4875ad3fea1d 3.13.0-123-generic #172-

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174901#comment-16174901
 ] 

Weiwei Yang commented on HDFS-12506:


Thanks [~linyiqun], [~nanda] and [~msingh] for all the comments. I have just 
uploaded the v2 patch to address the feedback. See details below.

bq. Should we list all the formats of the metadata keys for volume, bucket and 
key here? This will help the readability of the code.

Do you mean adding more documentation to explain the KSM DB schema? If so, I 
have added some documentation in {{OzoneConsts}}; I hope that helps.

bq. should isBucketEmpty be similar to isVolumeEmpty ?

Yes, they can be similar. I have modified the patch to make them so.

bq. Created HDFS-12525 to verify volume/bucket name in OzoneClient.

Thanks for adding a task to track this.

Let me know if I missed anything. Thanks all.

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch, 
> HDFS-12506-HDFS-7240.002.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174509#comment-16174509
 ] 

Yiqun Lin commented on HDFS-12506:
--

[~cheersyang], as [~nandakumar131] also commented, volume/bucket names are 
checked in {{OzoneUtils#verifyResourceName}}. We could add a new condition to 
this method so that a bucket or volume name cannot start with "#".
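
A minimal sketch of such a check follows. It is not the real {{OzoneUtils#verifyResourceName}}; the class name, the method name and the restriction to lowercase letters and digits are assumptions made for illustration. Only the "." and "-" allowance and the leading "#" rule come from this thread.

```java
/** Hypothetical name check; not the actual OzoneUtils#verifyResourceName. */
class ResourceNameSketch {

  /** Returns true if the name uses only the assumed character set and does not start with '#'. */
  static boolean isValidName(String name) {
    if (name == null || name.isEmpty()) {
      return false;
    }
    if (name.charAt(0) == '#') {
      return false; // explicit guard: '#' is reserved for DB key prefixes
    }
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      boolean allowed = (c >= 'a' && c <= 'z') // lowercase-only is an assumption
          || (c >= '0' && c <= '9')
          || c == '.'
          || c == '-';
      if (!allowed) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isValidName("vol-0-15143"));
    System.out.println(isValidName("#v1"));
  }
}
```

If the character set already excludes "#", as in this sketch, the leading "#" guard is redundant, but it keeps the reserved-prefix rule explicit should the allowed set ever be widened.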

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174503#comment-16174503
 ] 

Nandakumar commented on HDFS-12506:
---

Created HDFS-12525 to verify volume/bucket name in OzoneClient.

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174502#comment-16174502
 ] 

Weiwei Yang commented on HDFS-12506:


Hi [~linyiqun],

Thanks for the comments.

bq. Why is store.peekAround(0, dbVolumeRootKey); used here? Shouldn't we find 
the key to the right of dbVolumeRootKey (/#vol/#) and use store.peekAround(1, 
dbVolumeRootKey)?

I think peeking at offset 0 or 1 can both work, but 0 is slightly simpler, so I 
used that approach. Let me review this part of the code again to see if I can 
make it simpler. In the past we needed more checks because the key order was 
different from what it is now.
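
The offset-0 variant of the emptiness check can be sketched as below. This is not the patch itself: a {{TreeMap}} replaces the metadata store, {{ceilingKey}} stands in for "seek to the root key and peek at offset 0" under the assumption that a seek lands on the first key greater than or equal to the target, and the {{/#volume/#}} root key follows the {{/#vol/#}} form quoted above. The class and method names are hypothetical.

```java
import java.util.TreeMap;

/** Illustration only: emptiness check against a sorted in-memory key space. */
class VolumeEmptySketch {

  /** A volume is empty when no record starts with its bucket root key /#volume/#. */
  static boolean isVolumeEmpty(TreeMap<String, String> db, String volume) {
    String bucketRoot = "/#" + volume + "/#";
    // Smallest key >= bucketRoot. The root itself is never stored, so this is
    // the first bucket record of the volume if one exists.
    String first = db.ceilingKey(bucketRoot);
    return first == null || !first.startsWith(bucketRoot);
  }

  public static void main(String[] args) {
    TreeMap<String, String> db = new TreeMap<>();
    db.put("/#v1", "");
    db.put("/#v1/#b1", "");
    db.put("/#v2", "");
    db.put("/v1/b1/k1", "");
    System.out.println("v1 empty? " + isVolumeEmpty(db, "v1"));
    System.out.println("v2 empty? " + isVolumeEmpty(db, "v2"));
  }
}
```

Because the root key already ends with the bucket prefix, a single {{startsWith}} test on the first key at or after it is enough, with no extra prefix concatenation in the comparison.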

bq. In addition, the following failed unit tests seem related.

This Jenkins job tested an incorrect patch: the first patch I uploaded missed 
the changes to {{OzoneConsts}}, which is why the tests were failing. I deleted 
it and re-uploaded the patch. Let's wait for a new Jenkins result on the real 
v1 patch.

Thanks

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174501#comment-16174501
 ] 

Nandakumar commented on HDFS-12506:
---

[~cheersyang], [~linyiqun]

bq. With such a prefix definition, we will need to prevent users from adding 
volumes with names like "#volumeName" or buckets with names like "#bucketName".

We don't support any special characters other than "." and "-" in bucket/volume 
names; this is enforced in {{OzoneUtils#verifyResourceName}}.
The new OzoneClient implementation (HDFS-12385) does not do this yet, so I will 
file a JIRA to add the check there. Thanks for bringing it up.

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Mukul Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174499#comment-16174499
 ] 

Mukul Kumar Singh commented on HDFS-12506:
--

Thanks for working on this [~cheersyang]. The patch looks really good to me; 
just some minor nitpicks:

1) OzoneConsts.java: Should we list all the formats of the metadata keys for 
volume, bucket and key here? This will help the readability of the code.
2) KSMMetadataManagerImpl.java:245: Should isBucketEmpty be similar to 
isVolumeEmpty? I feel we can append OzoneConsts.KSM_KEY_PREFIX to 
keyRootName, which would make the code similar in both functions.


> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174430#comment-16174430
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HDFS-7240 Compile Tests ||
| 0 | mvndep | 0m 33s | Maven dependency ordering for branch |
| -1 | mvninstall | 17m 37s | root in HDFS-7240 failed. |
| +1 | compile | 1m 59s | HDFS-7240 passed |
| +1 | checkstyle | 0m 50s | HDFS-7240 passed |
| +1 | mvnsite | 2m 1s | HDFS-7240 passed |
| -1 | findbugs | 2m 3s | hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 1 extant Findbugs warnings. |
| -1 | javadoc | 0m 27s | hadoop-hdfs-client in HDFS-7240 failed. |
| -1 | javadoc | 0m 43s | hadoop-hdfs in HDFS-7240 failed. |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 53s | the patch passed |
| +1 | javac | 1m 53s | the patch passed |
| +1 | checkstyle | 0m 48s | the patch passed |
| +1 | mvnsite | 1m 52s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 33s | the patch passed |
| -1 | javadoc | 0m 24s | hadoop-hdfs-client in the patch failed. |
| -1 | javadoc | 0m 39s | hadoop-hdfs in the patch failed. |
|| || || || Other Tests ||
| +1 | unit | 1m 37s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 99m 54s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 144m 12s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.ozShell.TestOzoneShell |
|   | hadoop.ozone.ksm.TestBucketManagerImpl |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.ozone.web.client.TestBuckets |
|   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
|   | hadoop.ozone.ksm.TestKSMSQLCli |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888216/HDFS-12506-HDFS-7240.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linu

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174418#comment-16174418
 ] 

Yiqun Lin commented on HDFS-12506:
--

Thanks for working on this improvement, [~cheersyang]!
One comment for your patch:
{noformat}
   public boolean isVolumeEmpty(String volume) throws IOException {
-    String dbVolumeRootName = OzoneConsts.KSM_VOLUME_PREFIX + volume;
+    String dbVolumeRootName = OzoneConsts.KSM_VOLUME_PREFIX + volume
+        + OzoneConsts.KSM_BUCKET_PREFIX;
     byte[] dbVolumeRootKey = DFSUtil.string2Bytes(dbVolumeRootName);
-    // Seek to the root of the volume and look for the next key
-    ImmutablePair volumeRoot =
-        store.peekAround(1, dbVolumeRootKey);
-    if (volumeRoot != null) {
-      String firstBucketKey = DFSUtil.bytes2String(volumeRoot.getKey());
-      return !firstBucketKey.startsWith(dbVolumeRootName
-          + OzoneConsts.KSM_BUCKET_PREFIX);
+    ImmutablePair
+        firstBucket = store.peekAround(0, dbVolumeRootKey);
+    if (firstBucket != null) {
+      String firstBucketKey = DFSUtil.bytes2String(firstBucket.getKey());
+      return !firstBucketKey.startsWith(dbVolumeRootName);
     }
{noformat}
Why is {{store.peekAround(0, dbVolumeRootKey);}} used here? Shouldn't we find 
the key to the right of dbVolumeRootKey (/#vol/#) and use {{store.peekAround(1, 
dbVolumeRootKey)}}?
In addition, the following failed unit tests seem related:
 org.apache.hadoop.ozone.ksm.TestKeySpaceManager.testDeleteNonEmptyBucket
 org.apache.hadoop.ozone.web.client.TestKeys.testDeleteKey

> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
> Attachments: HDFS-12506-HDFS-7240.001.patch
>
>



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174407#comment-16174407
 ] 

Hadoop QA commented on HDFS-12506:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || HDFS-7240 Compile Tests ||
| -1 | mvninstall | 15m 2s | root in HDFS-7240 failed. |
| +1 | compile | 0m 55s | HDFS-7240 passed |
| +1 | checkstyle | 0m 39s | HDFS-7240 passed |
| +1 | mvnsite | 0m 59s | HDFS-7240 passed |
| +1 | findbugs | 1m 54s | HDFS-7240 passed |
| -1 | javadoc | 0m 40s | hadoop-hdfs in HDFS-7240 failed. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 1s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 40s | the patch passed |
| +1 | mvnsite | 1m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 23s | the patch passed |
| -1 | javadoc | 0m 42s | hadoop-hdfs in the patch failed. |
|| || || || Other Tests ||
| -1 | unit | 101m 21s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 130m 31s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.ozone.ksm.TestKeySpaceManager |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.ozone.web.client.TestKeys |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12506 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888213/HDFS-12506-HDFS-7240.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 78a968ec2e80 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / 244e7a5 |
| Default Java | 1.8.0_144 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/21265/artifact/patchprocess/branch-mvninstall-root.txt |
| findbugs | v3.1.0-RC1 |
| javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/21265/artifact/patchprocess/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HDFS-Build/21265/artifact/patchprocess/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/21265/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21265/testReport/

[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174215#comment-16174215
 ] 

Weiwei Yang commented on HDFS-12506:


Hi [~xyao], [~nandakumar131]

Thanks for contributing your ideas. This approach should work, but there is one 
thing I need to point out. With such a prefix definition, we will need to 
prevent users from adding volumes with names like "#volumeName" or buckets with 
names like "#bucketName", because that would cause problems.

If a user adds a volume *#v1* and a bucket is added as

/#v1/b1

KSM will be confused into thinking this is a volume named *#v1/b1*.
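
The confusion can be reproduced with a few lines of string handling. This is a sketch under assumptions: the exact layout under discussion is not shown in this comment, so the "/#" marker for volume records and the plain {{/volume/bucket}} path for the other record are hypothetical, as are the class and method names.

```java
/** Illustration only: how a user-chosen leading '#' collides with a reserved key prefix. */
class PrefixCollisionSketch {

  /** Assumed marker that identifies volume records in the DB. */
  static final String VOLUME_MARKER = "/#";

  /** Hypothetical un-prefixed path for a bucket record: /volume/bucket. */
  static String bucketPath(String volume, String bucket) {
    return "/" + volume + "/" + bucket;
  }

  /** A prefix test is all a scan has to tell record types apart. */
  static boolean looksLikeVolumeRecord(String dbKey) {
    return dbKey.startsWith(VOLUME_MARKER);
  }

  public static void main(String[] args) {
    String normal = bucketPath("v1", "b1");   // "/v1/b1"
    String clashing = bucketPath("#v1", "b1"); // "/#v1/b1"
    System.out.println(normal + " volume record? " + looksLikeVolumeRecord(normal));
    System.out.println(clashing + " volume record? " + looksLikeVolumeRecord(clashing));
  }
}
```

Since record types are distinguished only by key prefix, the marker character has to be excluded from user-supplied names for the scheme to stay unambiguous.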



> Ozone: ListBucket is too slow
> -
>
> Key: HDFS-12506
> URL: https://issues.apache.org/jira/browse/HDFS-12506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>  Labels: ozoneMerge, performance
>
> Generated 3 million keys in Ozone and ran the {{listBucket}} command to get a 
> list of buckets under a volume:
> {code}
> bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
> {code}
> This call took over *15 seconds* to finish. The problem is caused by the 
> inflexible structure of the KSM DB. Right now {{ksm.db}} stores keys as 
> follows:
> {code}
> /v1/b1
> /v1/b1/k1
> /v1/b1/k2
> /v1/b1/k3
> /v1/b2
> /v1/b2/k1
> /v1/b2/k2
> /v1/b2/k3
> /v1/b3
> /v1/b4
> {code}
> Keys are sorted in natural order, so when we list the buckets under a volume, e.g. 
> /v1, we need to seek to the /v1 point and then iterate and filter keys, which 
> ends up scanning all keys under volume /v1. The problem with this design 
> is that we have no efficient way to locate all buckets without scanning 
> the keys.
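
The cost described in the issue can be reproduced with a small sketch. It uses a `TreeMap` as a stand-in for the sorted key-value store; the class, method and field names are hypothetical, and this is not the actual KSM implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustration only: a sorted in-memory map stands in for ksm.db.
final class FlatLayoutScan {
  // Number of entries visited by the most recent listBuckets call.
  static int visited;

  static List<String> listBuckets(TreeMap<String, byte[]> db, String volume) {
    String prefix = "/" + volume + "/";
    List<String> buckets = new ArrayList<>();
    visited = 0;
    // Seek to the volume prefix, then walk forward in sorted order.
    for (String key : db.tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break; // left the key range of this volume
      }
      visited++;
      String rest = key.substring(prefix.length());
      // Bucket entries have no further '/' after the volume prefix;
      // everything else is an object key that must be filtered out.
      if (rest.indexOf('/') < 0) {
        buckets.add(rest);
      }
    }
    return buckets;
  }

  // The ten sample entries from the issue description.
  static TreeMap<String, byte[]> sampleDb() {
    TreeMap<String, byte[]> db = new TreeMap<>();
    String[] keys = {
      "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
      "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
      "/v1/b3", "/v1/b4"
    };
    for (String k : keys) {
      db.put(k, new byte[0]);
    }
    return db;
  }

  public static void main(String[] args) {
    System.out.println(listBuckets(sampleDb(), "v1") + " visited=" + visited);
  }
}
```

For the sample data, the scan returns four buckets but has to visit all ten entries, which is why the cost grows with the number of keys rather than the number of buckets.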



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173455#comment-16173455
 ] 

Anu Engineer commented on HDFS-12506:
-

[~cheersyang], very good find. You are right, this approach won't scale.




[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173440#comment-16173440
 ] 

Xiaoyu Yao commented on HDFS-12506:
---

We should keep a common prefix for all higher-level containers (volume, 
bucket) to avoid mixing them with objects. 
If we don't use /#v1/#b1, listing volumes will still incur the same overhead of 
iterating over all the keys that we have in the bucket case here.






[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Nandakumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173430#comment-16173430
 ] 

Nandakumar commented on HDFS-12506:
---

+1 for [~xyao]'s idea; I was thinking of the same thing.
One small change though:
For volumes
/#v1
For buckets
/v1/#b1
Keys can be stored as they are now.

With this, we can iterate to get the list of volumes without iterating over 
buckets, and get the list of buckets without iterating over keys.

Something like
{code}
/#v1
/#v2
/#v3
/v1/#b1
/v1/#b2
/v2/#b1
/v3/#b1
/v1/b1/k1
/v2/b2/k2
{code}
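
A minimal sketch of why this layout helps, again using a `TreeMap` as a stand-in for the sorted store (class and method names are hypothetical, not the KSM code). Because '#' sorts before letters and digits, all entries sharing a marker prefix are contiguous, so each listing becomes a bounded prefix scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: prefix scans over the '#'-marked layout shown above.
final class PrefixedLayoutScan {
  // Collects the name suffix of every key that starts with the given prefix,
  // stopping at the first key outside the prefix range.
  static List<String> scanPrefix(NavigableMap<String, byte[]> db, String prefix) {
    List<String> names = new ArrayList<>();
    for (String key : db.tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break;
      }
      names.add(key.substring(prefix.length()));
    }
    return names;
  }

  static List<String> listVolumes(NavigableMap<String, byte[]> db) {
    return scanPrefix(db, "/#");
  }

  static List<String> listBuckets(NavigableMap<String, byte[]> db, String volume) {
    return scanPrefix(db, "/" + volume + "/#");
  }

  // The nine sample entries from the comment above.
  static TreeMap<String, byte[]> sampleDb() {
    TreeMap<String, byte[]> db = new TreeMap<>();
    String[] keys = {
      "/#v1", "/#v2", "/#v3",
      "/v1/#b1", "/v1/#b2", "/v2/#b1", "/v3/#b1",
      "/v1/b1/k1", "/v2/b2/k2"
    };
    for (String k : keys) {
      db.put(k, new byte[0]);
    }
    return db;
  }

  public static void main(String[] args) {
    TreeMap<String, byte[]> db = sampleDb();
    System.out.println(listVolumes(db));
    System.out.println(listBuckets(db, "v1"));
  }
}
```

Each scan touches only the matching entries plus at most one extra key to detect the end of the range, independent of how many object keys the volume holds.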






[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173408#comment-16173408
 ] 

Xiaoyu Yao commented on HDFS-12506:
---

Thanks [~cheersyang] for reporting this. An easy fix might be to assign a 
distinct prefix to the keys of the volume and bucket objects themselves. For example:

The volume in your example would be keyed as
/#v1

The bucket in your example would be keyed as
/#v1/#b1

A regular key would be keyed as it is today, without the special prefix:
/v1/b1/k1

This way, if you just want to list volumes or buckets, the listing will not be affected by 
how many objects they contain. With some minor changes in the KSM MetadataManager, 
we should be able to handle this with better performance. Let me know your 
thoughts.






[jira] [Commented] (HDFS-12506) Ozone: ListBucket is too slow

2017-09-20 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172866#comment-16172866
 ] 

Weiwei Yang commented on HDFS-12506:


Opening this JIRA for discussion; adding [~anu], [~xyao], [~vagarychen], [~msingh], 
[~nandakumar131] to the loop. One quick thought: maybe we can leverage 
[RocksDB column families|https://github.com/facebook/rocksdb/wiki/Column-Families] 
to create CFs for users, volumes, buckets and keys. But that would require 
refactoring the KSM DB.
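
To make the idea concrete, here is a plain-Java model in which each family is an independent sorted map. This is not the RocksDB API; the class, enum and method names are hypothetical and only illustrate how separating entry types keeps a bucket listing away from object keys.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Model only: one sorted map per "family". Not RocksDB code.
final class FamilyPartitionedStore {
  enum Family { USER, VOLUME, BUCKET, KEY }

  private final Map<Family, TreeMap<String, byte[]>> families =
      new EnumMap<>(Family.class);

  FamilyPartitionedStore() {
    for (Family f : Family.values()) {
      families.put(f, new TreeMap<>());
    }
  }

  void put(Family family, String key, byte[] value) {
    families.get(family).put(key, value);
  }

  // Prefix scan restricted to a single family, so entries of other
  // types are never visited.
  List<String> scan(Family family, String prefix) {
    List<String> out = new ArrayList<>();
    for (String key : families.get(family).tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break;
      }
      out.add(key);
    }
    return out;
  }

  public static void main(String[] args) {
    FamilyPartitionedStore store = new FamilyPartitionedStore();
    store.put(Family.BUCKET, "/v1/b1", new byte[0]);
    store.put(Family.KEY, "/v1/b1/k1", new byte[0]);
    System.out.println(store.scan(Family.BUCKET, "/v1/"));
  }
}
```

The trade-off mentioned above still applies: every read and write path has to know which family an entry belongs to, which is where the refactoring effort would go.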
