[ 
https://issues.apache.org/jira/browse/HBASE-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973831#comment-13973831
 ] 

ramkrishna.s.vasudevan commented on HBASE-10801:
------------------------------------------------

I tested this patch with one minor modification: the SeekerState is not passed 
to the KeyOnlyClonedSeekerState, so that it holds only the primitive member 
variables (passing the SeekerState was a bit more costly).
I combined this with HBASE-10929 and added a filter, FilterAllFilter, that 
filters out every row that would be returned to the client.  This ensures that 
the scan path never needs to create a KV object (which also involves copying 
the value part), so the comparisons happen purely on Cells.  
Note that in this patch the key part is still copied in shallowCopy().
A full table scan with 1 thread over 2000000 rows gave the following results.
With patch
========
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 9.6820 seconds

hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.8490 seconds

hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.7680 seconds

hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.5470 seconds
{code}

Without patch
=========
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 19.4020 seconds

hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 6.1450 seconds

hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.8520 seconds

hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.6900 seconds
{code}
The table was loaded with the PerformanceEvaluation tool, so the value is 1000 
bytes per row.  As you can see, on the first scans, before the cache is loaded, 
the patched run takes about 50% less time (9.7 s vs 19.4 s, then 2.8 s vs 
6.1 s).  Once the cache is fully loaded the scans are not too costly and the 
timings even out with a small deviation.  A different value size may have a 
much bigger impact than this; I can also test with a much larger value.
The difference in performance during the first scan remains consistent.
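
To make the comparison easier to read, here is a minimal sketch that computes the without/with ratio for each scan. The timings are copied from the shell output above; the function and variable names are only illustrative and are not part of the patch.

```python
# Timings in seconds, copied from the shell output above (scans 002..005).
with_patch = [9.6820, 2.8490, 2.7680, 2.5470]
without_patch = [19.4020, 6.1450, 2.8520, 2.6900]


def slowdown_ratios(baseline, patched):
    """Return baseline/patched for each run, rounded to 2 decimal places."""
    return [round(b / p, 2) for b, p in zip(baseline, patched)]


ratios = slowdown_ratios(without_patch, with_patch)
for run, ratio in enumerate(ratios, start=1):
    print(f"scan {run}: without/with = {ratio:.2f}x")
```

The first two scans come out at roughly 2.0x and 2.2x, and the last two at about 1.03x and 1.06x, which matches the observation that the gap closes once the cache is loaded.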

> Ensure DBE interfaces can work with Cell
> ----------------------------------------
>
>                 Key: HBASE-10801
>                 URL: https://issues.apache.org/jira/browse/HBASE-10801
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.99.0
>
>         Attachments: HBASE-10801.patch, HBASE-10801_1.patch, 
> HBASE-10801_2.patch, HBASE-10801_3.patch
>
>
> Some changes to the interfaces may be needed for DBEs, or the way they 
> currently work may need to be modified in order to make DBEs work with 
> Cells. Suggestions and ideas are welcome.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
