[ 
https://issues.apache.org/jira/browse/HBASE-21520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720002#comment-16720002
 ] 

Zheng Hu commented on HBASE-21520:
----------------------------------

Because we have 10 HFiles here (NUM_FLUSHES=10), and the table holds ~1000 
cells (rows=20, ts=6, qualifiers=8, total=20*6*8=960 ~ 1000), each full table 
scan checks the ROWCOL bloom filter 20 (rows) * 8 (qualifiers) * 10 (HFiles) = 
1600 times. If we assume the average full table scan costs 50 ms, then each 
bloom filter check costs 50 (ms) / 1600.0 = 0.031 ms ... 
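
The estimate above can be reproduced with a small sketch. It only restates the 
arithmetic from this comment; the class and method names are hypothetical and 
are not part of TestMultiColumnScanner, and the 50 ms average scan time is the 
assumed figure from the comment, not a measurement made by this code.

```java
public class BloomCheckEstimate {
    // Figures taken from the comment; the names are illustrative only.
    static final int ROWS = 20;
    static final int TIMESTAMPS = 6;
    static final int QUALIFIERS = 8;
    static final int NUM_FLUSHES = 10;       // one HFile per flush
    static final double AVG_SCAN_MS = 50.0;  // assumed average full-scan time

    // Total cells written to the table: rows * timestamps * qualifiers.
    static int totalCells() {
        return ROWS * TIMESTAMPS * QUALIFIERS;
    }

    // ROWCOL bloom checks per full scan: one per (row, qualifier) per HFile.
    static int bloomChecksPerScan() {
        return ROWS * QUALIFIERS * NUM_FLUSHES;
    }

    // Average cost attributed to one bloom check, in milliseconds.
    static double msPerBloomCheck() {
        return AVG_SCAN_MS / bloomChecksPerScan();
    }

    public static void main(String[] args) {
        System.out.println("cells = " + totalCells());
        System.out.println("bloom checks per scan = " + bloomChecksPerScan());
        System.out.println("ms per bloom check = " + msPerBloomCheck());
    }
}
```

This is an upper bound on the per-check cost, since it attributes the whole 
scan time to bloom filter checks.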

> TestMultiColumnScanner cost long time when using ROWCOL bloom type
> ------------------------------------------------------------------
>
>                 Key: HBASE-21520
>                 URL: https://issues.apache.org/jira/browse/HBASE-21520
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: HBASE-21520.v1.patch, TestMultiColumnScanner.png, 
> rowcol.txt
>
>
> TestMultiColumnScanner times out easily; see HBASE-21517. 
> On my local machine, when I set the parameters to { 
> Compression.Algorithm.NONE, BloomType.ROW, false }, it took about 5 seconds, 
> but when I set the parameters to { Compression.Algorithm.NONE, 
> BloomType.ROWCOL, false }, it took about 45 seconds, which means 
> ROWCOL costs much more time than ROW.
> Need to find out what's wrong with this UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
